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Summary of Progress 


In this report, we will focus on the results obtained during the Pi’s recent sabbatical leave 
at the Swiss Federal Institute of Technology (ETH) in Zurich, Switzerland, from January I 
1995 through June 30, 1995. During this time period the PI and his postdoctoral research 
associate, Dr. Lance C. Perez, supervised one semester project and one diploma Project 
involving three ETH students: Dieter Arnold, Guido Meyerhans, and Jan Seghers. Both 
projects investigated various properties of TURBO codes, a new form of concatenated coding 
that achieves near channel capacity performance at moderate bit error rates. The semester 
project, entitled “The Realization of the Turbo-Coding System” and authored by Mr_Araold 
and Mr. Meyerhans, involved a thorough simulation study of the performance of TURBO 
codes and verified the results claimed by previous authors. A copy of the final report 
this project is included as Appendix A to this report. The diploma project, entitled “On the 
Free Distance of Turbo Codes and Related Product Codes” and authored by Mr. Seghers, 
includes an analysis of TURBO codes and an explanation for their remarkable performance. 
A copv of the final report for this project is included as Appendix B to this report. Recently, 
the PI presented a paper on this research at the 1995 Allerton Conference [8]. As a result of 
this presentation, we have been invited to submit a paper on TURBO codes to J^e special 
issue of the IEEE Transactions on Information Theory on "Codes and Complexity” scheduled 
for publication in November 1996. The following sections contain a brief summary of this 

research. 



1 Introduction 

The discovery of TURBO codes and the near capacity performance reported by the 
discoverers [1] has stimulated significant research effort to fully understand this new 
coding scheme [2]-[6]. In this paper, the performance of TURBO codes is explained in 
terms of the code’s distance spectrum. These results explain both the near capacity 
performance of the TURBO codes and the observed “error floor” for moderate and high 
signal-to-noise ratios (SNR’s). 


2 Performance Bounds 


In order to make clear the distinction between TURBO codes and convolutional codes, 
it is useful to consider these codes as block codes. To this end, the inputs sequences are 
restricted to sequences of length N , where N corresponds to the size of the interleaver 
in the TURBO encoder. With finite length input sequences of length N, a (2,1 ,v) 
convolutional code may be viewed as a block code with 2 N codewords of length 2 {v + N). 

The bit error rate (BER) performance of the convolutional code with maximum like- 
lihood (ML) decoding is upper bounded by 
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where Wi and d* are the information weight and total Hamming weight, respectively, of 
the i th codeword. Collecting codewords of the same total Hamming weight and defining 
the average information weight per codeword as 

w d 

Wd ~ JV 

where w d is the total information weight of all codewords of weight d and N d is the 
number of codewords of weight d, yields 
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If a convolutional code has N d codewords of weight d caused by an information se- 
quence x (D) whose first one occurs at time 0, then it also has a codeword of weight d 
caused by the information sequence Dx(D), a codeword of weight d caused by the infor- 


mation sequence 
increases we have 
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and the bound on the BER of a convolutional code with ML decoding becomes 



which is the standard union bound for ML decoding. For this reason, most efforts to find 
good convolutional codes have focused on finding codes that maximize the free distance 
dfree and minimize the multiplicity N d for a given rate and constraint length. 

The performance of the TURBO code with ML decoding is also bounded by 
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where to* and are the information weight and total Hamming weight, respectively, of 
the I th codeword. However, in the TURBO encoder the pseudorandom interleaver maps 
the sequence x(D ) to x'(D) and the sequence Dx(D) to a sequence xf t (D) that is different 
from Dx'(D) with very high probability. Thus, the input sequences x{D) and Dx(D) 
produce different codewords with different Hamming weights. 

Collecting codewords of the same total Hamming weight, the bound on the BER for 
TURBO codes becomes 
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For TURBO codes with pseudorandom interleavers, N d w d is much less than N and 
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where 


w d N d 


N 

is called the effective multiplicity of codewords of weight d. 


3 Asymptotic Performance 

In this section, the performance of TURBO codes for high SNR’s is addressed using the 
bounds developed in the previous section. In order to make the points more concrete, 
the performance of the TURBO code is compared to the performance of a maximum free 
distance (MFD) (2, 1, 14) convolutional code using simulations and analytical results. 
Figure 1 shows simulation results for the (2, 1,14) code with soft-decision Viterbi decoding 
and the TURBO code of [1] with iterative decoding and N = 65536. 

It is well known that for high SNR’s, the performance of a convolutional code with 
ML decoding is determined by the free distance asymptote. That is, for high SNR s the 
first term of the union bound is dominant. For the MFD (2,1,14) code the first term in 
the bound is given by 



where d free = 18, Nj ree = 18, and w° free = 137. In Figure 2, simulation results for this 
code are shown along with the free distance asymptote and it is clear that as the SNR 
gets large the asymptote accurately predicts the performance of the code. 

For the TURBO code, the first term in the union bound is given by 
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where N is the size of the interleaver. For N = 65536, is has been found [7] that the 
TURBO code has N free = 3 and w = 2. (This may vary slightly depending on which 
pseudorandom interleaver is used.) In Figure 3, the free distance asymptote is plotted 
together with simulation results. Once again, the asymptote is seen to be a good estimate 
of the performance of the code for high SNR’s. Thus, it can be concluded that the error 
floor’ of TURBO codes is due to their relatively low free distance and the corresponding 
free distance asymptote. 

It can be shown [7] that as the interleaver size N increases, the number of free distance 
codewords approaches a constant Kf ree that is much less than N. That is, 

lim N free — ^ free' 
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(This is not true for rectangular interleavers!) Thus, the ‘error floor’ can be lowered by 
increasing the size of the interleaver. Simulation results have shown that this is indeed 
the case. 


4 Performance at Low SNR’s 

Having demonstrated that TURBO codes have a relatively small free distance, their 
excellent performance at low SNR’s may seem even more surprising. In this section, 
the performance of TURBO codes at low SNR’s is explained by examining the code s 
distance spectrum. 

Returning to the MFD (2, 1, 14) convolutional code and Figure 2, it is clear that for 
low and moderate SNR’s there is a significant gap between the free distance asymptote 
and the simulated performance. That is, the real coding gain is less than the asymptotic 
coding gain. The (2, 1, 14) code has the following distance spectrum 
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where N§ is the number of paths of weight d and w $ is the total information weight. In 
Figure 4, the contribution of each term in the distance spectrum to the overall perfor- 
mance of the code is plotted and compared to simulation results. It is easily seen that 
for SNR’s less than 2.5 dB the free distance asymptote is the dominant term in the per- 
formance. This is due to the rapid growth of the multiplicities in the distance spectrum 
which causes the high weight paths to become dominant for low and moderate SNR s. 

In view of the analysis of the (2, 1, 14) code, it is reasonable to suggest that the excel- 
lent performance of the TURBO code at low and moderate SNR’s is due to a relatively 
“thin” distance spectrum. That is, the TURBO code is able to follow the free distance 
asymptote at lower SNR’s because the multiplicities of higher weight codewords are small 
enough that the free distance asymptote remains the dominant term in the bound. The 
distance spectrum of TURBO codes is the result of a process called “spectral thinning 



in which the interleaver effectively moves many lower weight codewords to higher weight 
codewords. This theory is supported by simulation results and actual calculations of the 

distance spectrum of some simple TURBO codes [7]. 

In Figure 5, simulation results are shown along with the contributions to the union 
bound of the first five terms of the distance spectrum of the TURBO code. (The distance 
spectrum of the TURBO was calculated assuming only information sequences of weight 
two.) Note that the free distance asymptote remains the dominant term in the bound 
even for low SNR’s. Thus, the excellent performance of TURBO codes at low SNR’s is 
a result of their relatively thin distance spectrum which enables the code to follow the 
free distance asymptote for low and moderate SNR’s. 

5 Primitive TURBO Codes 

For sufficiently large interleavers, it can be shown that the first several terms of the dis- 
tance spectrum are determined solely by information sequences of weight 2 [7]. As the 
size of the interleaver increases, the number of terms in the distance spectrum determined 
solely by weight 2 information sequences increases. Consequently, for a given interleaver 
size, the performance of TURBO codes can be improved by choosing the feedback poly- 
nomial in the encoder such that weight 2 information sequences generate high weight 

codewords. . , 

Using primitive polynomials as the feedback polynomial in the encoder causes weight 

2 information sequences to generate high weight codewords, thereby increasing the free 
distance of the TURBO code for larger interleaver sizes and improving the performance 
at moderate to high SNR’s. Figure 6 shows simulation results for two TURBO codes with 
an interleaver size of 400 and memory v = 4 encoders. As expected, the TURBO code 
with the primitive feedback polynomial performs better at moderate and high SNR’s due 
to the increased free distance. 


6 Conclusion 

The excellent performance of TURBO codes may be explained in terms of the distance 
spectrum of the code. The ‘error floor’ observed in simulations of TURBO codes is a 
manifestation of the free distance asymptote. Since TURBO codes have relatively ow 
free distances, the free distance asymptote dominates performance at moderate and low 
error rates. The ‘error floor’ may be lowered by increasing the size of the interleaver. 
The exceptional performance of TURBO codes at low SNR’s is due to 'spectral thinning 
and the resultant ability of the code to follow the free distance asymptote at moderate 
and low SNR’s. Finally, for a fixed interleaver siv.e, the performance can be improved by 
using primitive feedback polynomials in the encoder. This increases the free distance of 
the overall code and results in better performance at moderate to high SNR’s. 
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Figure 2: Asymptotic performance of the (2, 1, 14) code. 
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Figure 3: Asymptotic performance of the TURBO code. 
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Zusammenfassung 


1993 prasentierte eine franzosische Forschergruppe ein neues Codierungsverfahren, genannt 
Turbo Code. Dieses erlaubt zuverlassige Datenkommunikation auch fur Signal-/Rausch- 
verhaltnisse sehr nahe bei der Shannon Limit. Dieser Bericht erklart zuerst die Struktur des 
Turbo Code. Spater werden Wege aufgezeigt, welche die Leistungsfahigkeit des Codes 
vergrdssem. Dieses wird erreicht durch Anderungen an den Encodem und Verbesserungen 
am Interleaver. Zum Schluss wird noch die Klasse von Interieavem definiert, welche es 
erlaubt, beide Encoder des Turbo Code Encoders im gleichen Zustand enden zu Iassen. 


Abstract 


In 1993 a coding scheme called Turbo Coding was proposed by a French research group. It 
allows almost reliable data communication close to the Shannon limit. The structure of the 
Turbo Code is explained. Methods that achieve better performance through modifications of 
the encoders and the interleaver are discussed. Finally, the class of interleavers which allows 
both encoders of the Turbo Code to end in the same state will be defined. 


Zurich, July 14 th 1995 


Dieter Arnold 


Guido Meyerhans 
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Introduction 


1.1 Overview 

Chapter 1 is dedicated to an overview of the report and the problems addressed in this 
project. After this introductionary chapter, a description of the Turbo Coding System (TCS) 
as presented in [1] and its mathematical background are given in Chapter 2. 

Chapter 3 is dedicated to convolutional codes. Some general aspects of recursive 
convolutional encoders and their structure will lead directly to some rules for finding 
improved encoders for the TCS. Bounds for these improved encoders, which result in better 
performance than the encoder used in [1], are given at the end of the chapter. 

Interleavers influence the performance of a TCS in two ways. First, it is not possible to 
terminate both component encoders of the TCS at the same time with a typical interleaver. 
In order to terminate both encoders, interleavers have to be constructed according to 
special rules. Second, in order to achieve high performance the ability to move the elements 
in the interleaver has to be maximized. In chapter 4, it will be shown how compromises can 
be achieved for these contradictory demands. 

Simulations were carried out with three different kinds of quantization. The performance of 
the TCS with these different quantization schemes is explained in chapter 5. 

In chapter 6, improvements on the TCS are presented. These improvements follow directly 
from the knowledge obtained from the three previous chapters. 

The report ends with chapter 7, where general comments and concluding remarks are 
made. 

For a detailed presentation of the simulation results the interested reader is asked to look at 
appendix A. Simulation results are presented in appendix B. 


1.2 Tasks 

The following pages contain a copy of the tasks given at the beginning of this semester 
project. 



2 Introduction to Turbo Codes 


The aim of this chapter is to give an introduction to Turbo Codes and a basic explanation of 
their performance. To this end, we make use of the Turbo Coding System (TCS) presented 
in [1]. Throughout this chapter we follow closely the paper of P. Robertson [2] to clarify 
many questions that arose in the original Turbo Code paper [1]. 


2 . 1 The MAP algorithm 

In this section, a complete derivation of the Bahl, et al., maximum-a-posteriori (MAP) 
algorithm is given. This derivation is important for two reasons. 

First, the MAP algorithm has to be modified to be used with the systematic feedback 
encoders of the Turbo Coding System. Second, the modified MAP algorithm presented in 
[1] is unnecessarily complicated. A simpler version was presented in [6], which issued 
throughout the literature on Turbo Codes. 


2.1.1 Why and how the MAP algorithm has to be modified for recursive 
encoders 

The MAP Algorithm is derived in its most general form in [5]. Where ist application to 
nonrecursive encoders was also discussed. Nonrecursive codes are characterized by the 
following features: 

1) The input is equal to the first element of the state S at time k, that is 

d k = S 0 (k) where S k = (S 0 (k),Si(k),...,S M -i(k)). 

2) To transfer from any state at time k-1 to a particular state S k = m at time k it is 
required that the input bits become all zeroes or all ones. In other words, the 
states can be divided into two subsets according to the inputs required for a 
transition from time k-1 to time k (Figure 2.1) 



Recursive encoders do not have these features. Since the state at time k, S^, is defined by 
the input at time k, dk, and the state at time k-i, Sn, it is possible to reach state Sk with a 
one or a zero input depending on the previous state Sk-|. As a consequence of this, the 
states of a systematic feedback encoder cannot be divided into subsets based on the input 
(Figure 2.1). As a further consequence, the a posteriori probability cannot be calculated as 
£P(S k =m,Rr) 

P(d v = 0|R? ) = it — ( A? = subset of all states with S 0 (k) = d k ) (2. 1) 

k 11 P(S N =0,R?) 

In addition the decoding rule 

d Jo ifp(d k+ (K)>0.5 (2 . 2 ) 

[l otherwise 

cannot be applied as in [5], where R^ denotes the received bits from time 1 to N (= end of 
transmission). 



Figure 2.1: Example of a trellis of a non-recursive code (a) and a recursive code (b) 

Instead, for recursive encoders the logarithm of the ratio of the a posteriori probability 
(APP) of each information bit being a i to the APP of it being a 0 is calculated. That is. 


(2.3) 


A k = A(d k ) = log 


P(d k =l|RD 


P(d k =0|R^) 

is calculated and the decoding rule 
1, A k >0 


d k = 


0, A k <0 


(2.4) 


2.1.2 Derivation of the extrinsic information 


The expression for A k , given in [5] as 


A(d k ) = log 
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where dk denotes the data input into the encoder at time k, will now be derived. 


(2.5) 


First, define 


a k (m) = P(S k =mR[) = P(S k = m,Rf ) • 


1 

P(Rf) 


P k (m) — P(R k+1 |S k = m) • 


1 

P(R? + ,|Rf) 


Yj(R k ,m,m' ) = P(d k = i,R k ,S k = m,|S k _ ( = m' ) 
where state m at time k is denoted by S k = m. 
Recall Bayes rule. 


P(B)A) = 


P(A|B) • P(B) 
P(A) 


and the expressions for the conditional probability, 

P(A,B) 

P(A) 



Combining these two equalities we obtain the following well known relation 

P( A, B) = P(A|B) • P(B) 

which turns out to be very useful for the following derivation. 

Proof: 


( 2 . 6 ) 

(2.7) 

( 2 . 8 ) 


(2.9) 



( 2 . 10 ) 


In order to prove (2.5) one can simply prove 

P(d k =i|R^) = ^^y i (R lc ,m',m)-a k _ l (m’)p k (m) 

m m‘ 

To do so, the left side is first expanded over all states m = 0...M-1 and then (2.9) is applied. 
This results in 
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P(RD 


p(d k = i|Rr , )=x p < d >c = ‘A = m| R ^ =X p ( d k = *’ s k = m - R r )• 
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Since the observations are independent of time, this can be rewritten as 
^TP(d k =i,S k =m,R t »R k+I )- k N . 

m * tKj ,K k+1 ) 

Applying (2.9) to the numerator and the denominator gives 
£P0C,K = i.S k = rn,Rf ) P(d k = i,S k = m.Rf) 

m . 

P(R^,|Rf)-P(Rf) 

Using the property that events after time k are not influenced by the observation of Rf and 
bit d k if the state Sk is known (i.e. Markov property of the source), this can be simplified to 
V" 1 P(^k+l|^k ~ m ) P(d k “ b^k = m.R, ) 


P(Rk+i Rf) 


P(Rh 


( 2 . 11 ) 


P k ( m ) 

where the first term is recognized as pk(m). 

The second term can be rewritten as 

P(d k =i,S k =m,Rf) P(d k = i,S k = m,Rf~‘,R t ) _ 
P(Rf) PCRf-'-Rk) 


P(d k =i,S k = m,R k |Rf-') P(Rf-') 


P(R k |Rf‘‘) P(RD 

The first equality comes from the time independence of the observation and the second from 
(2.9). The second term can now be canceled. Notice that expanding the denominator over 
all possible states m = 0...M-1 at time k-1 and all possible inputs i = 0, gives 

P(R k |R, k -' ) = £ X p ( d k = i. ■ Si = m - R k IR?' 1 ) • (2. 1 ■ 3) 

m i 

So the proof continues simply with the numerator which can be expanded over all states 
m’= 0...M-1 at time k-1 as follows. 


, k-l 


( 2 . 12 ) 
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Again using the time independence of the observation results in 
~*'S k =m,S k _j -m ,R k ,Rj ) ' p^k-t ^ 


and with (2.9) this becomes 


= £p(d k = i,S k = m,R k |s k _, =m',Rf-') 


_ P(S k .,=m',Rr l ) 


P(Rf-') 


a k ,(m') 


= Jp(d k = i,S k = m,R k = m|S k _, =m',R| c " , )*a k _ 1 (m') = 

m* 

Now the substitution of the second term with 0 Ck.|(m’) and the first term with Yk(Rk,m,m ) 
gives 

= P(d k =i,S k = m,R k = m|S k _, = m')-a k „ 1 (m' ) = 

7 i (R k ,m,m') ~ 

in' 

Equation (2.1 1) can now be expressed with (2.13) as 

p(d k =i,S k =m,R k |Rf-) lY,(R t .m.m')a k -,(m') 


p(R k |R, k ‘‘) 


m i m 


Interchanging the sums over i and m’ is allowed since otk-i(m) does not depend on i, yielding 
^Y i (R k ,m,m , )a k _ 1 (m' ) 


_ m* 


Z X X y> ( R k - m ’ m ' ) a k-i ( m ' ) 

m m* i 

Combing (2.1 1) with (2.10) gives 


(2.15) 


JjT' i (R l ,m,m l ')ci H (m' )(3 k (m) 

p(d k = i|R r 1 ) = m 


(2.16) 


m m i 


Finally, setting i =0 and i =1 and dividing gives 

p(d k =l|R") IX Y 1 (R k ,m',m)a k _ ] (m' )P k (m) 

A(d k ) ~ ^ p(d k = 0|R ” ) _ 1° § X X Y 0 (R t , m' , in) • a k _, (m’ ) • p k (m) (2 ' 1 7) 

m m' 

since the denominator of (2.16) is in both cases equal. 


□ 



Difference with the original Turbo Code paper [1] 


In the original Turbo Code paper [1], equation (2.14) was also expanded over all possible 
previous data inputs d k ,i =0,1. This leads to ot k (m) depending also on the input d k .j = i. 
This is unnecessary, as shown in [7], since a k (m) + a k (m) is equal to the 0Ck(m) defined in 
( 2 . 6 ). 


2.1.3 Recursive calculation of Ok(m) and Pk(m). 

The recursive calculation of Ok(m) and p k (m) is now derived. 

Ok(m) was defined in (2.6) as 

a k ( m ) = P(s k = m|R, )= P(S k = m, R, ) • • 

The second term of (2.1 1) can be obtained by expanding (Xk(m) as defined in (2.6) over the 
possible inputs 

P(d k = i,S k = m,R, ) 

= X dTtdTT • 


Substituting the term in the sum with (2.15) results in a forward recursion for (Xk(m) given 
by 


X2>,(R k .m ,m' )a k _,(m') 

“ k(m) = 

m m’ i 


(2.18) 


with the initial conditions Oo(0) = 1 and Oo(k) = a^k) = 0 ,V k * 0 and Vn^O. 


Expanding p k (m) as defined in (3.2) over all possible states m = 0...M-1 and all possible 
inputs i = 0,1 at time k+1 gives 



M m ) 


y, 'y'j p(^k-H ~i*Sk+i ~ m ’^k+2 

p(R^,|rO 


Rk.i|S k =m) 


where again the time independence of the observations was used. 
Using Bayes’s rule, the numerator becomes 


m’ i 


“ m)*P(d k+ j “hSjj+j — m ,R k+J |S k m) 

Yi(Rk.i.m,m’) 


The backward recursion for p k (m) is then given by 

XlY i ( R k+ „ m ,m')P k+ ,(m’) 

P t (m)= m ' ‘-A - i (2.19) 

,(r k +i » m, m ) ■ GCjj (m ) 

m m' i=0 

with the initial conditions p N (0) = 1 and p N (k) = Pn(k) = 0 ,V k * 0 and Vn^O. 


2.1 .4 The additive character of the MAP algorithm 

Looking at (2.17), (2.18) and (2.19) it is clear that the transition probability Yi is the most 
important term. 

Equation (2.9) can be rewritten as 

Yj(R t ,ra’ ,m)= p(R k |d k = i,S k =m,S k _, = m')-P(d k =i|S k = m,S k _ l =m') 
•P(S k =m|S k _,=m') 

As will be show in section 2.2.3, R k consists of y^ , the systematic information sequence, 
and y k , the parity sequence 1 , which are independent of each other. The first term can 
therefore be written as a product of the two received bits. That is, 
p(R k |d k =i,S k =m,S k _, = m')= p(y‘|d k = i,S k = m,S k ., = m’) 

•P(ykl d k =i . s k= m - s ic-. = m ’) 


1 In 2.2. 1 we shall see that x k p consists itself of two parity sequences. 



i -(y k p -omfP(i.ni , .ni)) z 

where p(f) = — r=-e 2<s is the Gaussian distribution (we consider here the 

2va 

channel to have additive white Gaussian noise, see 2.2.3), and 

om^ p (i,m',m) 

is the modulator output (see 2.2. l.d). 

Since y k is independent of the states m, it can be taken out of the logarithm directly. 
Defining the a priori probability 


P(S k = m|S k _, = iri ) = 


,A(d k ) 


P(S k = m[S k _, = m' ) = 1 - 


l + e A,d ‘> 

A(d k ) 


when P(d k = l|S k = m,S k _, = m') = t and 


1 + e 


A(d„) 


when P(d k = 0S k = m,S k _ t = m’) = 1 


leads to 


A(d k ) = — -y-y’ +A(d k ) + log 



X X^(y S ■ m ’ - m ) ’ “n ( m ' ) ’ Pt ( m ) 

m m* 

Z X ^o(y ’ m' .■ m) • a k _, (m' ) • P k (m) 


A.(d k ) 


( 2 . 20 ) 


with 


Yj(y£,m’,m) = p(yE|d k =i,S k =m,S k _, =m’)-P(d k =i|S k =m,S k _ l -m'). 


The first term in (2.20) is the received channel output multiplied with a constant Kc 
depending on the channel. The second term is the a priori probability of the bit k being a 1 
or a 0. The third term, the so called extrinsic information, can be considered as the updated 
a priori information for a possible second decoder. Together, these terms give the a 
posteriori probability A(d k ). Figure 2.2 shows the structure of a soft input/soft output 


decoder. 



A priori Prob.. 
A(d k ) 

Channel Output_ 
K c -y 5 k 


Soft-Input 

Soft-Ouput 

Decoder 


A posteriori Prob. 
A(d k ) 

Extrinsic Info. 
Ae(dk) 


Figure 2.2: Structure of a Soft-Input/Soft-Output decoder 


2.2 The Turbo-Coding System 


In this section, we discuss the three main parts into which a communication system can be 
divided: Namely the transmitter, the channel and the receiver as shown in Figure 2.3. 


Transmitter 


Channel 


Receiver 


Figure 2.3 - The three main parts of the Turbo-Coding system 


2.2.1 Transmitter 


The transmitter itself can be divided into five parts: the source, the encoder, the puncturer 
and the modulator as shown in Figure 2.4 


Source 


Encoder 


♦ 


Puncturer 


Modulator 


To the 
channel 


Figure 2.4: Transmitter 


a. Source 

The source produces the message to be transmitted. For the simulations reported here, a 
source generating a random binary sequence was used. In order to avoid difficulties caused 











by a poor random number generator the Data Enscritption Standard (DES) random- 
generator from [17] was used. 


b. Encoder 

The encoder consists of an interleaver and two binary, recursive, systematic, convolutional 
encoders (RSC), which henceforth are referred to as component encoders 1 and 2. One 
therefore operates in GF(2) and all additions are EXOR operations. 

A non-uniform interleaver of size 64000, which is filled up row by row and read pseudo- 
randomly (Fig. 2.5), is used in the TCS. Although interleavers are well known, they are of 
central importance in a TCS. Chapter 4 is dedicated to a more detailed presentation of 
interleavers and their influence in a TCS. 


Input 

[1,2, 3,..., 12] -> 


Figure 2.5: Nonuniform interleaver 
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Output 

— » [7, 2, 3,11,1,12, 4, 5, 6, 9, 8, 10] 


As proposed in [1], the component encoders are identical, although this is not necessary in 
general. These component encoders are defined by their rate R, the number of states 
(M=2 m ) determines the constraint length n A = (m + 1) and visa- versa, a feedback generator 
Gi and a feedforward generator G 2 . The first component encoder operates directly on the 
information to be transmitted, while the second component encoder is fed with the 
interleaved information sequence. The output of the first component encoder consists of the 
systematic sequence and the parity sequence, while the output of the second component 
encoder consists of the parity sequence only. The systematic sequence of the second 
component encoder is not transmitted (Figure 2.6). 





Figure 2.6: Encoder 


As mentioned before, the same encoders and interleaver as in- [ 1 ] have been implemented. 
The two component encoders have constraint length n A = 5 , generators Gi = 37 and G 2 = 21 
and are used in a parallel concatenation (Figure 2 . 7 ). 

The data sequence d is the input. The first output sequence x ls is equal to the input 
sequence because of the systematic structure of the first component encoder. The second 
output sequence x ,p is the parity sequence from the first component encoder. This is given 


by 


rvi vn 

x ,p =[x[ p ,X2 P ,...,x| c p ,...,x l N p ] with x ! k p = £gf - a Li and a k = d k +Xs- ‘ a L 


i=i 


where G 2 =(gf,g2,g3,gi) = (10,0,1) is the feedforward generator and 


Gi=(gf ,g2,gj,gj) = (U,U) is the feedback generator. 

The third output sequence X * 9 is the second parity sequence, generated by the second 
component encoder. The parity sequence x 2 * 5 is obtained in the same way as x Ip with the 
difference that the data d 1 on which the second component encoder operates is interleaved. 
Thus, 


x 2p = [x 2p ,x 2p ,...,x 2p ,...,x 2p ] with x 2 k p =£gf-aL and a 2 =d k +£g- -a 2 ^ , 


i=l 


i=l 


where again (gf.g2.g3.g4) = C 1 AO, 1) , (gf.g^gJ.gS) = (U,U). 


b _.b _b _b • 





d=*[d,,d, d„J x- =(*:.*? *il 



The encoder produces three bits, x I k s ,x 1 k p and x k p , for each input bit, and thus the rate of 
the encoder is 1/3. 

Both component encoders are assumed to start in the all zero state: i.e. a[ andaf are zero. 
Due to the interleaver the termination of both component encoders at the same time turns 
out to be difficult. At this point, only the first component encoder is terminated and the 
second is left ‘open’. As a consequence, the initial condition of p N (k) in the second, non- 
terminating trellis has to be changed from (2. 1 9) to 

p N (k) = a N (k),V k = 0...M-1. (2.2 1 ) 

A solution to this problem was not given in [1]. However, it will be shown in chapter 4 that 
terminating both trellises is possible for certain classes of interleavers. 


c. 


Puncturer 


In order to achieve a higher global rate than R = 1/3 the parity sequences x lp and are 
punctured according to a puncturing matrix P. For a global rate of R = 1/2, P is chosen as 

O (A 


P = 


0 1 


( 2 . 22 ) 


and the punctured ‘parity bit’ sequence is 

v pp — r Y 'p Y 2p Y ip Y 2p i 

A l A | * a 2 » ■ • * * a n~P ^ N J ’ 

This means that x! p is sent at every odd time and xl p at every even time k. 






d. 


Modulator 


The modulator assigns to every bit of the incoming sequences x^, x 1 a value corresponding 
to its level. We chose the same modulator as in [1] which assigns a +1 in case of a 1 and -1 
in case of a 0 according to (2.22) 

xm pp = 2 • x pp - 1 

xm k z x k (2.23) 

xm k = 2 • x k - 1 

where xmf and xm k are the outputs of the modulator at time k. 


Figure 2.8 gives a detailed overview over the transmitter 



2.1.1 Channel 

The channel for x ,s and x^ is the additive white Gaussian noise (AWGN) channels where an 
independent noise sample n 1,2 with zero mean and variance a 2 2 = No l2 / 2 is added to each 
transmitted bit. The channel inputs are denoted by x and the channel outputs by y (Figure 
2.9). 
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y k = x k + n k 
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Figure 2.9: Channel model 


2.1.2 Receiver 


The receiver provides the Turbo Code with its name and its performance. Analogous to the 
transmitter, the receiver can be divided into a depuncturer, a decoder and a sink. The 
demodulator is part of the decoder (Figure 2. 10). 


From the 
channel 



Figure 2.10: Receiver 


a. Depuncturer 


The depuncturer separates the (demultiplexed) parity bit sequence into the two parity bit 
sequences y lp and y 2p and substitutes a zero for the punctured bits (Figure 2.11). 



Figure 2.1 1: Depuncturer 


b. 


Decoder 





The decoder consists of two deinterleavers, two interleavers, a final harddecision decoder 
and two softinput/softoutput decoders which are denoted as component decoder 1 and 2. 
Both component decoders use the modified MAP algorithm. 

The first component decoder is fed with the a priori probability A lm , the information 
sequence y ls and the parity sequence y lp . The result of the first component decoder is A lout , 
the extrinsic information which is independent of the a priori information and the systematic 
information for the corresponding bit d*. At the output of the first decoder, the a posteriori 
probability (APP) A loul , is not of interest, since the decoder has not made use of y 29 which 
may improve the APP. 

Feedback 



Figure 2.12: Decoder 

The second component decoder is fed with A 2m (which is equal to A Iout deinterleaved) as 
the a priori probability, x ls deinterleaved and x 2 **. The de interleavers are necessary to ensure 
that x ls , A 2in at time k corresponds to x 2 * 5 at time step k. The second component decoder 
produces the new extrinsic information A 2ou ‘ and the a posteriori probability A, since both 
parity bits are now used. 

This procedure is known as iterative decoding. The two component decoders are 
suboptimal because the first component decoder uses only half of the available redundant 
information (x ,p but not x 2 * 5 ). As a consequence, an improvement in performance can be 
achieved through the use of a feedback loop (A lin = A 2oul ). In the simulations up to 18 
iterations were used as proposed in [1]. This means that for every information bit dk both 










component decoders are used 18 times. The term Turbo Codes is given for this iterative 
decoding scheme using feedback with reference to the turbo engine principle. 

After 18 iterations, the APP of the second component decoder is led to the final hard- 
decision decoder which decodes d e according to (2.4). 


c. Sink 


In the simulation program, the sink is simply represented by a procedure for counting the 
decoding errors. 


The achieved results with the algorithm are shown in Figure 2.13. The performance is the 
same like presented in [1]. 
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Figure 2.13: BER of Turbo Code with a 65536 bit interleaver 


Some modifications on the encoder and interleavers are made in the following chapter. But 
the structure of the algorithm will not be changed. 




3 Convolutional Codes 


The turbo code uses a recursive encoder to produce its parity check bits. This chapter 
discusses the structure of the trellis used in the turbo-code, which is important to 
understand design considerations for interleavers. 


3. 1 General aspects of recursive convolutional codes 


The turbo codes are realized with convolutional encoders in systematic feedback form. Such 
encoders always have a feedback and a forward path as shown in Figure 3.1. The encoder is 
given by its generator polynomials in octal numbers, e.g. (37,21). The first number defines 
the feedback and the second one the forward path. 



K±) ► [x/ p , x,V. O 


Figure 3. 1 : Standard turbo encoder (37,2 1 ) 


The constraint length is defined as 

n A = m+ 1, (3.1) 

where the memory order m is the length of the shift register. The constraint length is not as 
imprtant for recursive encoders, as it is for feedforward ones. Another parameter, the so 
called cycle length nc yC | C is more important for recursive codes. The cycle length is best 
defined by considering the encoder output sequence for a base information sequence. A 
base sequence is an information bit sequence in which only one bit of the vector is equal to 
1, e.g. d=[0, 0,0,1, 0,0,0, ..,0,0]. The corresponding output sequence shown in Figure 3.2, has 
a repetitive structure. The length of this repetitive sequence is called the cycle length. 






[001 000 0 00000 0 .. . 000 ] — ^ 


Encoder 


[00 1 1 00 1 0 1 00 1 0 1 00 1 0 ...] 
rr ' 


Figure 3.2: Output vector of a base sequence 

The cycle length is given by the feedback path of the encoder and is bounded by the 
memory order by 

m^n^^-l (3.2) 


In the case of the turbo code with generators (37,21), the cycle length is n^ = m + 1 . 

The cycle length is important in the design of the interleaver, as will be shown in chapter 3.3 
and chapter 4. 


3.2 Structure of the Turbo Code feedback shift 

register 


The structure of the systematic feedback encoder is important in the design of good 
interleavers and in the achievement of better performance. As described in the previous 
chapter, a recursive encoder has a repetitive output sequence. A similar cyclic process exists 
in the content of the encoder’s memory. 

Figure 3.3 shows the typical structure of the turbo code encoders, where all feedback taps 
are equal to 1. 


[^Ni ^N-1* 



Figure 3.3: Structure of a recursive encoder, where all feedback taps are set 





The content of the memory of the encoder shown in Figure 3.3 is given by 
S»(n) = £l k (n)-d k 

lc=l 

with l k (n) = ((n-k) mod(m+l) =o)®((n- k) mod(m + l) = l) 

= ((n-k)mod(m + 1) = o)®((n- k - 1) mod (m + 1) = o) 

and where 
S, +l (n) = Sj(n-l) 

m: Memory order of encoder 

So(n)..Sm-t(n): Content of the memory cell at time n 

dk: encoder input bit at time k 


(3.3) 


(3.4) 


Example: 

The information sequence for an encoder with feedback polynomial (37) is 
d=[0, 1 , 1 ,0, 1 ,0,0,0, 0, 1,1,1]. According to (3.3) and (3.4) this will lead to the following 


contents of the encoder memory: 
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Induction proof of (3.3) 


Initialization condition: l k (n) = 0 V n < 0 or n < k, S 0 (n) = 0 V n < 0 and because of (3.4): 
S. (n) = 0 V n < i and the first incoming value: S 0 (1) = d , , 


Initialization test: 

l t (1) = ((1 — 1) mod (m-t-l) = o)^((l — 1) mod ( rrn- 1) = i) = 1 00 = 1 Assumption is true 
for the Initialization 


Recursion: 

/ m— 1 


S 0 (n) = 




( 3 . 4 ) 


XS/n-l) + ld„ 

Vi=° > 

£S 0 (n-l-j) +1 . dj 

Vi=o J 

^m-l n-l-j ^ 

X S 1 k( n-1- j) d k 

\ j = Q k=l J 


+ l-d. 


^m-1 n-1 'N 

xiu ( n-i-j)-d k 

\ j=0 k=l 


/ 


m--l n-l 


XX 1 k (n — 1 — j) d, 


j 1 k n j_Q^ see ^ e | ow 

+ 1 • d„ 


+ 1 • d , 


^ n-l m-1 

XXl k ("-l-j)d k 

Vk=I j=0 
( n-l 

X((n-1 - k) mod(m + 1) =0)© ((n - 2 - k) mod ( m + 1) = 0) 

u=i 


• • • ®((n - 2 - k) mod (m + 1) = 0) • • • 

• • • ((n - 1 - (m - 2) - k - 1) mod (m+ 1) =0)® ((n - 1- (m- 1) - k) mod (m + 1) = 0) 
0((n - 1 - (m - 1) - k - 1) mod (m + 1) = 0)) • d k + 1 • d n 


/n-l 


- 1) mod(m+ 1) = 0)© ((n-k)mod(m + 1) = 0)*d, 


\k=l 




+ ((n - n — 1) mod (m + 1) = 0) 0 ((n - n) mod (m + 1) = 0) • d n 

n 

= X((n-k - 1) mod (m + 1) = 0) © ((n - k) mod (m + 1) = 0) • d k 

k=l 



Remark: 


SH U(n-l-j) d k 
i=l k ”°~ J =0, see below 


= 0 , because 


l k (n) = OVn<k, here: n-l-j<n-j and n — 1 — j < n 


a 


The corresponding cycle length for recursive encoders where all feedback taps are set is 
n c>c!e — ni + 1 . This can cosily be seen for a base sequence. The contents of the encoder 

memory are equal every m+1 time units and therefore the cycle length has to be m+1. 


3.3 Improved trellis for the Turbo Code scheme 


The choice of the generator polynomials for a given memory order m influences the 
performance of the Turbo Code. 

Calculations of the distance spectrum as well as simulations for pseudo random interleavers 
have shown that the cycle length of the linear feedback shift register should be 2 m - 1 . This 
maximal cycle length can be achieved with a primitive feedback polynomial. In a subsequent 
chapter, there is an explanation of why it is crucial to use maximum cycle length shift 
registers. 

The feedback shift register of a maximum cycle length code has another advantage. 
According to [16] it is not possible for a base information sequence that all of the ones of 
the output are on even or odd positions. This means that it is not possible to cancel all the 
ones with puncturing. This would reduce the distance dramatically. 

While the feedback path is mainly responsible for the cycle length, the feedforward 
polynomial has to to be choosen such that catastrophic encoders are avoided. If the case of 
catastrophic encoders is excluded, it is not so important which feedforeward generator is 
used. 



3 . 3.1 


Distance spectrum 


A change of the feedback path has an influence on the distance spectrum of the code. 
Unfortunately, it is very difficult to calculate the whole distance spectrum ot the Turbo 
Code system given in [1], because the calculation time for such big interleavers is very high. 
Therefore, the effects for interleavers of size 32 are considered to illustrate the basic 
principles. 


The following figure shows the distance spectrum for an encoder with memory three and 
with different feedback polynomials for an interleaver of size 32. As described above, the 
best results can be achieved with maximum cycle length feedback shift registers. 

Distance spectrum 



Figure 3.4 - Distance spectrum of turbo codes with different feedback paths (Blocksize: 32, Iterations: 10) 




4 Interleaver 


Interleavers are commonly used on channels with memory so that they appear to the 
channel decoder as memoryless. In most coding systems, they have no influence on the 
performance, if memoryless channels are used. This is not the case in the turbo coding 
scheme. In this scheme, the interleaver is an integral part of the encoder and can 
dramatically effect the performance even on memory less channels. 


The interleaving process scrambles the order of the code symbols d n in time while the 
deinterleaving process unscrambles the recovered symbol stream into the original sequence. 


[d 1 ,d 2 > d 3 l d 4 ,d 5 ,d e ,d 7 ,dj 

[d6«d2id 7 ,d 3l d 8 ,d 1 ,d 4f d 6 ] 



[d 5 ,d 2 ,d 7 l d 3 ,d 8 ,d 1 ,d 4 l d e ] 


[d 1 ,dj,d 3 ,d 4 t d 5 ,d 8 ,d 7t dJ 


Figure 4.1: Interleaving process 


This chapter discusses different kinds of interleavers and gives some rules for building good 
interleavers for use with Turbo Codes. 




4.1 Interleaver design considerations 


The original turbo code [1] uses a pseudo-random interleaver. For this class of interleavers, 
the only condition to the interleaver function G(i) is that the transformation is bijective. 



information bits 


pseudo random premutation 
of information bits. 


interleaved information bits 


Figure 4.2: Pseudorandom Interleaver 


The transformation itself is a pseudorandom process, which is defined once and does not 
change during the encoding / decoding process. Pseudorandom interleavers follow no 
special rules and can therefore result in a transformation, which leads to a low Hamming 
distance. The following figure shows the error correcting capability of the Turbo Code, 
using a typical pseudorandom interleaver of size 400. 
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Figure 4.3: Random Interleaver (Blocksize: 400, E^/No: 1 .9 dB) 


Most of the errors can be corrected through the iteration process. However, there are some 
errors which cannot corrected and ‘fall through’. It is significant that error patterns of odd 
weight seldom occur and that when an error occurs it is with a high probability a two bit 
error. A brief look at the bit error probability on different inlerleaver / block positions 
shows that the bit errors are concentrated in certain positions. 
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Figure 4.4: BER on different bit-positions within a block 



The positions with a high bit error rate are given by a special interleaver symmetry, which 
leads to a small weight for certain codewords. An information bit sequence with just two 
ones is an example for such a case. 



Figure 4.5: Encoder structure 


If dj and di+cycie are equal to one and all other bits are zero, then x lp , the output of encoder 1 , 
has only two ones after puncturing. A worst case interleaver will just move both information 
bits by n. Then there will be only two parity check bits in x 2 * 5 , the output of encoder 2. The 
resulting distance from the all zero codeword is only 6. 
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Figure 4.6: Turbo-code with small free distance 


A turbo code using the bad interleaver shown in Figure 4.6 has poor error correction 
capabilities. This kind of interleaver cannot avoid that there are many two bit error, called 
errors of order 1, in the code. Errors of order n are defined as the number of two by two 
errors in a block (e.g. Error of order 2: 2 pairs of errors). 

Applying the following rules to the design of the interleaver decreases the probability of 


errors of order one and two: 





Maximize 


min(|z - j\ + |G(i) - CJ( j)|), V /, where 
i ± j,i mod = j mod , G(i) mod = G(j) mod 

(4.1) 


i j are the positions of the switching bits and G is the transfer function of the interleaver. 


:reasing errors of order 2 


Maximize 

min^i i - i 2 1 + 1 j , - j 2 1 + |G (i , ) - °(j , ]( + |G (i 2 ) - G(j 2 )| - Ov(i, j) - Ov(G(i), G ( j))), 
WpVji-jz. where i, *i 2 # j, * j 2 ,i, mod n c)cte =i 2 mod n C)tle , 
j, mod n c)ck = j 2 mod n C)Cle , G(i, ) mod n C)C , e = G(j, ) mod n C)Ck , 

G(i 2 )mod n wle = G(j 2 ) mod n c)clc 


The Overlap Ov(i,j) is defined like showed in Figure 4.7. The example showed in Figure 4.8 
has no Overlap. 



Overlap 

Figure 4.7: Overlap in Error of order 2 
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Figure 4.8: Bad interleaver for 2 nd order errors 









Applying these rules will encrease the performace. Unfortunately, it is only possible to 
decrease errors of order up to two with a reasonable effort on computational power. Figure 
4.9 shows an interleaver where the minimum of (4.1) and (4.2) is 30. 
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Figure 4.9: Error correction capabilities of improved interleavers (Biocksize 400, Eb/N 0 1 .9 dB) 


The influence of the interleaver improvement on the two first iterations is small. But with an 
improved interleaver, there are less errors which can ‘fall trough’ and this improves the 
performance. 

The resulting free distance of the turbo code is a function of the trellis as well as the 
interleaver. Simulations have shown, that optimized 400 bit Interleavers are better than 
interleavers of size 1000 for a signal to noise ratio above 2 dB. 



Interleavers of size 400 
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Figure 4.10: BER of improved interleavers 


4.2 Interleaver with maximum cycle length encoders 

The improvement rules explain why it is important to use a maximum cycle length encoder. 
According to (4.1) a bad transformation is given if i mod n cycle = j mod n c>clc and 

G(i) mod n cycic = G(j) mod n CiCle . Therefore the probability of choosing a bad interleaver 

transformation for the element j is 

( 4 - 3 ) 

“ cycle 

In order to minimize bad interleaver transformations one has to maximize the cycle length. 
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Figure 4.1 1: Influence of maximum cycle length code to the performance 


Another way to improve the average weight of a crucial codeword is to increase the size of 
the interleaver. If a bad interleaver transformation is given, then the average weight of the 
parity bit sequence is 


N' 


d = w 


w 


X-l 

'n' 3 -N') 



f 3 ( 


where w is the number of ones in a cycle w = - - - - - and N' = , where N is the 

2 ^c>cle 


interleaver size. So the average weight of the parity bit sequence is 
f „ \ , 


d =~ n cyc le 


N 


+ 1 


n. 


= l( N + n cl.) 


(4.4) 


V" cycle J ^ 

The observation that a bigger interleaver size will lead to better performance has been made 
several times in the literature[7]. 


4.3 Distance spectrum 


The distance spectrum of the code is partly a function of the interleaver. Calculations of 
distance spectrums for small interleaver have shown, that the average distance remains the 
same, even the interleaver is improved. With an improved interleaver the variance of the 
distance spectrum decreases and the minimum distance is increased. The higher free 
distance lowers the error floor. This explains why a performance increase can only be 
shown for a signal to noise ratio above 2 dB in Figure 4.10. For lower SNR’s the whole 
distance spectrum is responsible for the performance and, because the average distance does 
not change, an improvement cannot be expected there. 


4.4 Terminating interleavers 


One of the problems with the Turbo Code scheme is, that it is not possible to terminate both 
encoders with a general interleaver. The original MAP algorithm cannot be used for the 
second encoder. This will lead to slightly lower performance especially for small 
interleavers. 

There exist different interleavers that terminate both encoders for every 

interleaver size N, where N is a multiple of nc yc ie- For these block sizes, a permutation of 
elements such that 

i mod = G(i) mod n ocle Vi (4.3) 

where G(i) is the interleaver function, has no influence on the content of the memory cells at 
the end of the block. Therefore both trellis will terminate. 



\ / 

Permutations only under elements of the same block 


Figure 4.12 - Permutations for terminating interleavers 



The following proof is valid only for encoders where all feedback taps are equal to one (See 
Figure 3.3). The terminology is the. same like in (3.3) 

If N mod (m+l)=0 the contents of the memory cells can written as follow: 

N N 

Sj(N) = S<» w «,w + VO < j < m (4.4) 

ic=l k=l 


Proof: 

N 

nH-1 

Sj(N) — ^d k . (m+1) _j +dk(m+l)-H 

k=l 

N 
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k=l 
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= 2(( N “ k “ j) mod (m 4- i) = 0)d k © (( N - k - j - J) mod(m + !) = °)d k 
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N 
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N -j 

=» Sj(N) = ^((N-k- j)mod(m+ 1) = 0)d t ®((N - k- j - l)mod(m+ 1) = 0)d k 

k=l 

N 

+ I((N-k — j) mod(m+l) = 0)d k ©((N-k- j-l)mod(m+l) = 0)d k 

k=N-j+l ^ 

=0, because of Initialization 

= S 0 (N-j) 

Addition is a commutative operation, so a change in the order has no influence on the result 
and both encoders will be in the same (zero) state at the end of the block. The second 
encoder terminates if and only if i mod n cycIe = G{i) mod n cycle V/ , where G is the 

interleaver function. 


It is possible to build optimized interleavers under these conditions. However the 
performance of terminating interleavers will be worse in general because of the reduction in 
the freedom in placing the elements. Simulations have shown, that it is only worth to make 
self terminating interleavers, if their size is less than 1000. 


A special class of interleavers, where both encoders are terminated, are helical interleavers, 
as discussed in earlier works [8] [9]. No interleaver optimization will be made for this class 



of interleavers, so that the axis ratio of the helical interleaver is crucial to the performance. 
The performance of the Turbo Code with a good and a bad helical interleaver is show in the 
following two examples. 



Figure 4.13: Helical interleaver of the size 1 190 with different axis ratios 

The helical interleaver with the axis ratio of about 1 clearly gives the better results. A look 
at the distribution of the errors within a block shows that the errors are concentrated on a 
band by the 1 19*10 helical interleaver 



# of errors 

40 80 120 160 200 240 280 320 360 

Figure 4.14: Errors in different block positions (1 19*10 Helical interleaver, Eb/N 0 : 1 dB) 


The error band corresponds with the symetry of the helical interleaver. Calculating the 
distance given by (4.1) and (4.2) will show similar bands. 
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Quantization 


Loss of error correcting performance due to quantization is well known. This loss can be made 
negligible by using an appropriate number of quantization levels. An eight level quantization 
scheme has turned out to be sufficient in practice. 

However, all of the simulation results in the Turbo Code literature were obtained without 
quantization (to the best of the authors. Therefore, the question arises whether quantization 
has an unexpected influence on Turbo Codes. That is on the performance improvement with 
increasing number of iterations. 


5.1 Optimized Quantization 


For many applications, the results achieved by equal spaced quantization is sufficient. To get 
better results and to say something about the quality of equal spaced quantization, it is possible 
to apply the optimized quantization presented by Massey [13]. He showed that if the 
likelihood ratio is defined as 



Jpv(y|o) 

91 

Jpv(y|i) ’ 


then the optimum values for the likelihood thresholds are 
T j =^('R j -,).^ j )Vj. 


( 6 . 1 ) 


(6.2) 


For an AWGN channel the probability density function of the received signal is given by 

f / , \2\ ( - • — 


Pv(y|°) 


1 




exp 


(y-VC) 


N, 


’ p * (y|1)= VFiC xp 


(y+VC) 
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( 6 . 3 ) 



Equation (6.3) can be normalized and written as 

p ^°) = i exp f~^rH ^ = i exp 


(y + a) 




E 2 

.where a = 2 — - and s = — y 


N, 


N, 


With the following iterative procedure it is possible to find the optimized likelihood thresholds 
Tj and quantization thresholds Ij. The example is made for an n-Level quantizer. The 
initialization conditions are I 0 = I n = T 0 =0 and T n = °° 


1. 

2 . 


Choose Ii arbitrarily. 
Calculate 


]p s (s|0)ds 


M*o) = f 


Jps(s|l) ds 


3. Find h using (6.1) and (6.2). First calculate 


M*,) 


T2 


then find I 2 from 


'|ps(#fc 


*(*.) = 7 


|p s (s|l)ds 


4. Find I3 through I n .i in the same manner by repeating 2 and 3 

5. Calculate 

if choose a smaller value for II and return to 2. If A.(9t n _j)< 

choose a bigger value for II and return to 2. Stop if ) • 

The conversion between the likelihood and the quantization thresholds is done in the following 


way 



~/jr ex P 

Ps(s|0) V* 

PsOW i 

-==exp 
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5.2 Simulation Results 


The quantization results scheme affects the performance of the code. This depends on the 
number of quantization levels and the appropriate choice of them. 
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Figure 5.1: Comparision of different quantization methods on different iterations for 32-level quantization 


Figure 5.1 shows simulation results where different quantization methods are compared for 
several iterations. The first method is called the ‘Equal-Space’ method where the differences 
between all quantization levels are equal to 1/16 for 32-level quantization; i.e. they are 












independent of the Signal-to-Noise-Ratio (SNR). The second method, called ‘optimal 
quantization*, is also a 32-level quantization method but the levels are now dependent on the 
SNR, as proposed in [4]. As a comparison, unquantized decoding was used to obtain the 
results of the blue graph. The best results are achieved without quantization. Choosing the 
quantization levels as a function of the SNR minimizes the loss of performance caused by 
quantization. 

These observations can be summarized as follows. The simulations results show that 
quantization leads to the expected performance degradation. It is interesting that this shifting is 
fully maintained throughout all iterations. 



Figure 5.2: BER function of the Turbo-Code with floor (dashed without floor) 

In conclusion it can be said that quantization has no unexpected influence on Turbo Codes. 
However, one detail has to be mentioned. Since the BER-function of the Turbo Code has an 
error floor, a small dB loss results in much greater BER loss in zone B than in zone A (see 
Figure 5.2). 
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Improvements on the Turbo Coding 
scheme 


In the original turbo-code scheme the output information sequence is coming out of the 
decoder two. This has the disadvantage that the output is based on a nonterminating trellis, 
if a general interleaver is used. By changing the decoding order and decoding the open 
trellis first, it is possible to improve the performance. 

Feedback 
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Figure 6.1: Modified Turbo-Code scheme 


The change of the decoders has the second advantage of a decreased complexity, because 
no deinterleaver is necessary for the output. 

The following figure compares the performances of the two decoder positions. The same 
interleaver, trellis is used for both decoding schemes. 







7 Conclusions 


We have explained the structure of the turbo-code scheme. The results claimed in [2] and 
[7] can be reproduced with the simulation programs except for the optimized 1024 bit 
interleaver from [7], 

It is possible to increase the performance of the turbo-code by using maximum cycle length 
linear feedback shift registers in the encoder. The use of them decreases the probability of 
low weight code words. 

An interleaver modification rule was presented which allows the construction of better 
codes. In combination with maximum cycle length linear feedback shift registers better 
results could be achieved than presented in [7]. 

Finally, we have given the class of interleavers which allows that both encoders of the 
turbo-code ends in the same state. 
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Appendix B - Simulation results 


Bit error rate of Turbo Codes with different interleaver sizes 


Interleaver size: 65536, Generator polynomials (37,21), Memory order: 4 
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Interleaver size: 16384, Generator polynomials (37,21), Memory order: 4 
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Interleaver size: 1024, Generator polynomials (37,21), Memory order: 4 
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Simulations with quantization 

All simulations are done with an interleaver size of 400 and the generator 
polynomials (37,21). 


Quantization levels: 4 (Equal spaced) 
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1 Introduction 

In 1993, Berrou, Glavieux, and Thitimajshima [1] stunned the coding community with their 
ann ouncement of a coding technique, which they called “TURBO Codes”, that came within 
0.7 dB of the Shannon limit for the additive white Gaussian noise channel used with a spectral 
efficiency of 0.5 bits per dimension. These claims were initially viewed with some skepticism, 
but subsequent research [2], [3] has verified the performance of this new coding scheme. TURBO 
codes and similar schemes are now an active research area throughout the world. 

The TURBO Code utilizes a parallel concatenation of two identical punctured convolu- 
tional codes, whose encoders are realized in systematic feedback form, and a complex inter- 
leaving/deinterleaving scheme. The use of punctured systematic codes reduces the complexity 
of the decoder and allows higher spectral efficiencies to be achieved. The interleaver and dein- 
terleaver serve two purposes. First, they perform the traditional role of redistributing and 
decorreiating the bursty errors out of the decoder. Second, they insure that both encoders do 
not simultaneously output low weight code sequences. 

This encoding scheme is used in conjunction with an iterative decoding technique based on 
a modified version of the maximum a posteriori (MAP) algorithm introduced by Bahl, et. al. 
[4]. The MAP algorithm is used because it naturally produces reliability information about its 
decoding decisions. This “soft” information can then be used in the next iteration of decoding. 
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It is now widely believed that the primary contribution of Berrou et. al. [1], is the decoding 
algorithm and not the encoder structure. 

Though the performance of the TURBO code has been reproduced, several fundamental 
questions concerning their performance and structure remain. The goal of this project is to 
address some of these questions, enumerated below, and to develop a general framework in 
which the properties and performance of TURBO codes can be easily understood. To assist 
you in this undertaking, you will have access to a computer simulation program for TURBO 
codes that is being written in a concurrent semester project. This program is to be used as an 
experimental tool to test and verify your hypotheses and conclusions, not simply to generate 
performance curves 

2 Tasks 

1. Familiarize yourself with the literature on TURBO codes and prepare a short, informal 
presentation on the state of the art. 

2. Find the free distance of the TURBO code. This may be done by developing an appro- 
priate computer algorithm or by analytical methods, but the answer must be exact. 

3. Develop a relationship between the interleaver structure and the free distance of TURBO 
codes. 

4. Examine and discuss the relationship between TURBO codes and product codes. 

5. Explain the performance of the TURBO code in terms of its distance properties. In 
particular, the effect of iterated decoding and the presence of the error floor [2] should be 
addressed. 

6. Determine a meaningful measure of the decoding complexity of the iterated MAP algo- 
rithm. Use this measure to make an intelligent performance versus complexity comparison 
between TURBO codes and other complex coding schemes. 

7. Construct alternative coding schemes to the TURBO code. The program being developed 
in the semester project may be used to simulate these codes. 
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Abstract 


It is the goal of this project to explain the performance of Turbo Codes. An algorithm for 
determining the free distance of Turbo Codes is proposed and applied to a few examples. 
The origin of the error floor is explained. The method of random interleaving is corrected 
and extended to punctured Turbo Codes. The distance spectrum of Turbo Codes is inves- 
tigated using random interleaving. A theory is developed that explains the performance of 
Turbo Codes. The decoding complexity of Turbo Codes is reviewed. Constituent encoder 
optimization and interleaver design are discussed. 


Chapter 1: Introduction to Turbo Codes introduces the Turbo Code’s performance and 
the Turbo Coding scheme. 

Chapter 2: The Free Distance of Turbo Codes presents an algorithm to calculate the free 
distance of a Turbo Code. This algorithm is applied to a few examples. The origin of 
the error floor is identified. The problem of interleaver design for optimal free distance is 
formalized. 

Chapter 3: The Distance Spectrum of Turbo Codes explains how the distance spectrum 
determines the Turbo Code’s performance. In a first analysis, the influence of the inter- 
leaver on the distance spectrum is shown. The method of random interleaving is corrected 
and extended to punctured Turbo Codes. Theoretical results are derived that show how 
the distance spectrum changes as the interleaver size increases. A theory is proposed that 
relates the Turbo Code’s performance to its distance spectrum. 

Chapter 4: The Decoding Complexity of Turbo Codes reviews the decoding complexity. 
The decoding complexity of a Turbo Code is compared to the decoding complexity of a 
convolutional code with Viterbi decoding. 

Chapter 5: Improvements for Turbo Codes discusses constituent encoder selection and 
interleaver design. The good performance of Turbo Codes with maximum cycle length 
constituent encoders is explained. 

Chapter 6: The Relation Between Turbo Codes and Product Codes 
Chapter 7: Conclusions 
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Chapter 1 

Introduction to TURBO Codes. 


1.1 The Turbo Code’s performance. 


In [1], Berrou et al. claim to have build a code that comes within 0.7 dB of the Shannon 
limit for the additive white Gaussian noise channel used with a spectral efficiency of 0.5 
bits per dimension. Since these results were first published, several research groups have 
been able to verify this performance [5], [6]. Figure 1.1 shows simulation results for a 
Turbo Code with constituent encoder and puncturing pattern as defined in [1], and a 64K 
pseudorandom interleaver. These simulations were performed at this institute by Arnold 
and Meyerhans [18]. Their results confirm the claim made by Berrou et al: Only a signal 
to noise ratio of 0.7 dB is required to obtain a bit error rate of 10 

Performance curves of Turbo Codes show a typical shape, which is unlike that of any other 
code. First, their is a rather fiat part for moderate signal to noise ratios. In the literature 
[5], this part of the performance curve has been labeled as the ‘error floor’. Below a certain 
signal to noise ratio, the steepness of the curve suddenly increases. It is the combination 
of the error floor and the drastic increase in steepness that puzzles the coding community. 
In order to appreciate the performance of Turbo Codes, the Turbo Code’s performance is 
compared to that of a (2,1,14) convolutional code with Viterbi decoding. This convolu- 
tional code was build for use on the Galileo space probe and has a free distance of eighteen, 
see [19]. At a bit error rate of 10“ 5 the Turbo Code offers a coding gain of approximately 
2 dB compared to this complex convolutional code. 

Figure 1.2 shows how the performance of Turbo Codes evolves as the interleaver size 
increases. These Turbo Codes consist of the same constituent encoders and puncturing 
pattern as the code mentioned above. The interleavers are pseudorandom and of sizes. 
100, 400, 1024. Two effects call the attention: As the interleaver size increases, the error 
floor shifts downward and dominates the performance down to lower signal to noise ratios. 
The combination of these effects causes the performance curve to grow steeper as the 
interleaver size increases. 

It is the goal of this project to explain these effects. 
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Figure 1.3: The Turbo coding scheme. 



Figure 1.4: The Turbo encoder with puncturing pattern as in [1]. 


1.2 The coding scheme. 

Figure 1.3 shows a block diagram of the Turbo coding scheme. The source emits a block of 
information bits with block length N. T h i s information block is encoded and transmitted 
over a binary input additive white gaussian noise channel. The Turbo decoder produces 
estimated values for the information bits 
The encoder and decoder are now separately discussed. 

The Turbo encoder, see Figure 1.4, is a parallel concatenation of two recursive system- 
atic convolutional encoders, called the constituent encoders. The information block, which 
has a finite length, is encoded by the first constituent encoder, but interleaved before it 
enters the second constituent encoder. Note that the interleaver operates on an informa- 
tion sequence, and not on a code sequence. The interleaver size equals the information 
block length. The output of the constituent encoders is punctured to increase the spectral 
efficiency. 

Mi], Berrou et al. propose constituent encoders with generators (37,21). The systematic 
output of the second encoder is completely punctured. On even time steps the parity bit 
of the second encoder is punctured, on odd time steps the parity bit of the first encoder. 
This puncturing pattern leads to a rate of ^ f° r the Turbo Code. 

Berrou et al. have used pseudorandom 64K interleavers for their simulations. A pseudoran- 
dom interleaver is randomly generated, but then used unaltered throughout the encoder s 
operation. 

Berrou et al. do not mention how the constituent encoders are terminated. Since the 
encoders are recursive, it does not suffice to set the last M information bits to zero in 
order to drive the encoder back to the all zero state. The necessary termination sequence 
depends on the state of the encoder at the moment that the termination is to begin. Note 
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Figure 1.5: The Turbo decoder with puncturing pattern as in [1]. 


that, in general, driving the first encoder to the all zero state does not terminate the 
second encoder. This is due to the interleaver. It has become a standard approach in the 
literature [3] that the first encoder is terminated, and the trellis of the second encoder is 
simply ‘left open’. 

The encoder is used in conjunction with a suboptimal iterative decoding scheme based 
on a modified version of the maximum a posteriori (MAP) algorithm introduced by Bahl 
in [4], see Figure 1.5. The MAP algorithm is used because it produces reliability informa- 
tion about its decoding decisions. 

A MAP decoder has two inputs: the channel output and the a priori probabilities for the 
information bits. Its output consists of the a posteriori probabilities for the information 
bits. 

For the first iteration, the a priori probabilities W2k entering the first MAP decoder are set 
to 0.5. After interleaving, the a posteriori probabilities w xk produced by the first MAP 
decoder are fed into the second MAP decoder as a priori probabilities w XL . The second 
MAP decoder, however, does not output the a posteriori probability, but a slightly modi- 
fied quantity called the extrinsic information. The calculation of the extrinsic information 
is almost identical to that of the a posteriori probability. When the a posteriori proba- 
bility is calculated for bit dk with the a priori probability of dk set to 0.5, the extrinsic 
information is obtained. The combination of the interleavers and the extrinsic information 
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Figure 1.6: The (2,1,4) R SC encoder with generators (37,21), as proposed by Berrou in 
[ 11 - 


decorrelates w 2 k and (xk,yk) so that the iterative decoding is effective. Now the extrinsic 
information produced by the second MAP decoder enters the first decoder as a priori prob- 
ability. From the second iteration on, both decoders produce the extrinsic information. 
In this way, also w\i and y 2 i are decorrelated. After the last iteration, the second MAP 
decoder produces the a posteriori probabilities appi that are used to make a hard decision 
about the values of the information bits. In practice, the extrinsic information exchanged 
between the MAP decoders is in the log-likelyhood ratio form, see Section 1.4. 

For the simulations in [1], eighteen iterations are used. All the simulation results in this 
report are obtained with eighteen decoding iterations. 

Note that none of the Turbo Code’s components are new. The MAP algorithm was 
presented already in 1972. Iterative decoding was introduced in 1954 by Elias in [17]. 

1.3 Recursive systematic convolutional encoders. 

The constituent encoders of Turbo Codes are convolutional encoders in systematic feed- 
back form. Figure 1.6 shows the constituent encoder proposed by Berrou et al. in [1]. The 
generator sequence is (37,21). The first number, 37, defines the feedback path, the second 
number, 21, defines the forward path. 


1.4 The MAP algorithm. 

The MAP algorithm, introduced by Bahl in [4], calculates the a posteriori probabilities 
of the states and transitions of a Markov source observed through a discrete memoryless 
channel. Bahl applied the algorithm to terminated feedforward convolutional codes. In 
[1], Berrou modified the algorithm for recursive convolutional codes. The treatment of 
the MAP algorithm that gives the best insight for use with Turbo Codes is published by 
Andersen in [5]. Only the results are stated here: 
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The recursive systematic convolutional encoder has 2 ^ states St (M is the memory of 
the encoder), input dt and output Xt , t — 0, 1, — , L. L stands for the information block 
length. The encoder starts and ends in the zero state. The output Xt is a function of the 
input dt and the previous state St— i* The source is described by the transition probabilities 
Pr(St — m|5t_i = to') that depend on the a priori probabilities for the information bits 
dt. The output X t is observed through a binary input AWGN channel. The output of the 
channel is Y t . 

The following quantity is defined: 

7 = Pr(S t — m; lt|5t_x = m') 

= Pr(S t = m|5 t _x = m!) • Pr(Y t \S t = m; St-i = w!) 

= Pr(dt = i; Yt|St_i = m f ) 

= 

can be calculated using the a priori probabilities of the information bits and the 
probability density function p(Yt\Xt). Now the quantity ai(m) = Pr(St = m\ Y*) can be 
calculated by the forward recursion: 

ort(m) = ort_i(mo) • 7t~i( m o> 0) +a t _i(m , 1 ) *7t-i( m i> 1 ) 

using the starting conditions oro(0) = 1, and ao(m) = 0, m = 1, 2, . . . , 2 M — 1. In general, 
Y% stands for the sequence Y a , Y^+x, . . . , Yb-x> The state leading to m with input i is 
denoted m\. 

The quantity A(m) - PriY^St = m) can be calculated as a backward recursion: 

A (to') = P t +i(mo) • 7t+i( m/ j°) + A+l( m 0 ’7t+i( m/ > !) 

with the boundary conditions 0) = 1, and = 0, m = 1, 2, . . . , — 1. The state 

obtained from m! with input i is denoted m*. 

Then the quantity < 7 t{m! ,m) — Pr{St-i — in*; 5$ — m\Y^) can be calculated as: 

) = a t -i(m') • ,m) ■ A(m) 

= a t _x(m') • 7 j(m', *) - Pt{C(m',i)) 

= 

where the function C{m > ,i ) gives the new state when the old state is m! and the input 
d t = t. 

The log-likelyhood ratio is then defined as: 

Pr(dt = \\observation ) 

A(d t ) log p r ^ _ 0j observation ) 

Pridt = 1; observation ) 

— lose ■ 

& Pr(d t = 0; observation ) 

= 

S E m ^(rn’,0) 



Chapter 2 


The Free Distance of TURBO 
Codes. 

2.1 Introduction. 

An important characteristic of every code is its free distance. For large enough signal to 
noise ratios, the free distance will dominate the code’s performance. Therefore, it makes 
sense to start the analysis of Turbo Codes by determining their free distance. 

The project description asks to find the free distance of ‘the’ Turbo Code, thereby refering 
to the code that was introduced by Berrou et al. in [1]. However, this paper defines ‘the’ 
Turbo Code in an ambiguous manner. The exact form of the interleaver and the way the 
constituent encoders are terminated sue not mentioned. As can easily be shown, both 
have an effect on the free distance of the code. 

The interleaver used to obtain the simulation results in [1] is a pseudorandom interleaver. 
This means that it is generated randomly and then used unaltered throughout the simu- 
lation. An example of how the interleaver influences the free distance will be given later 
in this chapter. 

It is not that obvious that the termination of the constituent encoders influences the free 
distance. Consider a Turbo Code with a rectangular interleaver and the constituent en- 
coders chosen as in [1]. ‘Rectangular’ means that the information sequence is written 
linewise and read colummwise, as is the case for interleavers used to decorrelate decoding 
errors in serial coding schemes. In the event that none of the encoders is terminated, the 
free distance cannot be bigger than two. The information sequence consisting of all zeros 
and a single one in the last bit position is interleaved to itself. Thus, only in the last time 
step of the encoding will this single one cause a deviation from the all zero state in the 
trellises of both encoders, which results in a weight two codeword. Now, assume that the 
last four bits of the information sequence are used to terminate the first encoder. Then, the 
codeword weight cannot be smaller than the free distance of the first constituent encoder, 
which equals four after puncturing. Thus, the termination influences the free distance of 
a Turbo Code. 

For the reasons mentioned above, it is not possible to find the free distance of ‘the’ Turbo 
Code as mentioned in task two of the project description. Nevertheless, in this chapter, an 
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Figure 2.1: The creation of a codeword. The information sequences causing the completed 
detours in the first encoder are printed bold. 


algorithm to find the free distance will be introduced. This algorithm assumes termination 
of the first encoder as was proposed by Robertson in [3|. The algorithm will be applied to 
two examples to demonstrate the influence of the interleaver choice on the free distance. 
The results will also show how the error floor is caused by the free distance. 

At the end of this chapter, the problem of interleaver design for maximizing the free dis- 
tance will be formalized and mathematically defined. The approach will be to develop a 
parity-check matrix for a Turbo Code. This approach shows the influence of the interleaver 
on the basis vectors of the code. 

Finally, in the conclusion, a comparison will be made between the work on the free distance 
presented in this report and the results of Robertson in [3]. 

2.2 An algorithm for determining the free distance. 

2.2.1 Theory. 

An im portant characteristic of the Turbo Code is its block length N. The most straight 
forward way to find the free distance of this code is to encode all possible information 
blocks and record the smallest codeword Hamming weight. For any practical values of N, 
however, the number of information blocks, 2 N , is just too high to make this calculation. 
It is the goal of this algorithm to focus only on those information blocks that could pos- 
sibly cause the free distance. The number of information blocks that have to be encoded 
to find the free distance will be drastically reduced. The algorithm only requires an upper 
bound on the free distance and the termination of the first constituent encoder. 

First, an example is presented that shows how a codeword of a Turbo Code is formed. 
This example leads to a better understanding of the algorithm that follows. In Figure 2.1, 
an information block entering the first constituent encoder is shown. Since the first encoder 
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must be terminated, this inf ormation sequence causes a number of completed detours in 
the trellis of the first encoder. A ‘completed detour’ is defined as a path that starts and 
ends in the zero state, but does not return to the zero state between its start - and endpoint. 
This information block is interleaved before it enters the second constituent encoder. In 
general, the interleaved information block will not terminate the second encoder. Thus, 
in the trellis of the second encoder, it will cause a number of completed detours, but also 
an incompleted detour at the end of the trellis. An ‘incompleted detour’ is defined as a 
path that starts in the zero state, but never returns to the zero state. An incompleted 
detour is finite because it stops at the end of the trellis. It is therefore possible that 
the free distance is caused by multiple detours in both constituent encoders, including an 
incompleted detour in the second encoder. 

Every detour in the trellis of an encoder is caused by a unique information sequence 
that has the same length as the detour. In the example above, there are two completed 
detours in the trellis of the first encoder. Therefore the information block has to contain 
the information sequences that cause these detours. Before, between, and after these 
sequences, the information block can contain only zeros since the encoder stays in the zero 
state at these locations. One could say that these sequences are ‘embedded in zeros’. By 
‘embedded in zeros’ it is meant that the bit positions before the first sequence, between 
two consecutive sequences, and after the last sequence, all contain a zero bit. 

Consider an information block that terminates the first encoder: it causes n completed 
detours in the trellis of the encoder. This information block has to contain the n sequences 
that cause these detours. These n sequences are embedded in zeros. 

Consider n information sequences that each cause one completed detour in the trellis of 
the first encoder. An information block that consists out of these n sequences embedded 
in zeros, causes the corresponding detours in the trellis of the first encoder. Since all these 
detours are completed, the information block terminates the first encoder. This implies 
the following result: 

Lemma 1 Given is a Turbo Code with block length N and a terminated first constituent 
encoder. Let S be the set of all information sequences that cause a completed detour, with 
length smaller than or equal to N, in the trellis of the first constituent encoder. Every 
information block terminating the first constituent encoder consists of elements of S em- 
bedded in zeros. But also: Every information block of length N that consists of elements 
of S embedded in zeros, terminates the first constituent encoder. 

This lemma gives a method to construct all information blocks that terminate the 
first constituent encoder of a Turbo Code. This task is equivalent to determining all 
information blocks that consist of elements of S embedded in zeros. Notice that a certain 
element of S may appear more than once in such an information block. The sum of the 
lengths of the elements of S that such am information block contains, has to be smaller 
than the blocklength N. The combined Hamming weight that such an information block 
causes in the first encoder, is the sum of the weights of the detours that it causes. 

To determine all the information blocks terminating the first encoder, the following steps 
are taken: 

• Determine the set S of all information sequences that cause a single completed detour 
in the trellis of the first encoder and whose length is smaller than or equal to N . 
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• Determine all different combinations of elements of S such that a combination may 
contain a certain element more than once, and the sum of the lengths of the sequences 
that a combination contains, is smaller than or equal to N . 

• For each combination found in the previous step, determine all information blocks 
that contain the sequences of the combination by embedding these sequences in 
zeros. 

This lemma is now used to focus only on those information blocks that could possibly 
cause the free distance. Therefore an upper bound on the free distance is needed. Such an 
upper bound can be obtained in three ways: a guess, a good guess relying on simulation 
results, or an analytically derived bound. 

Only the information blocks that cause an output Hamming weight in the first encoder 
that is smaller than or equal to the bound, can cause the free distance of the Turbo Code. 
Thus, only those information blocks have to be considered that consist of elements of S 
such that the sum of the weights of the detours that correspond to these elements, is 
smaller than or equal to the bound. This requirement immediately eliminates all elements 
of S causing a weight bigger than the bound. 

The total output weight caused by an information block is given by the sum of the weights 
caused in each constituent encoder. When an information block is assembled using the 
method above, the output weight it causes in the first encoder, is the combined output 
weight of its detours. The output weight for this information block in the second encoder 
can be determined by simply encoding it with the second encoder. 

These considerations lead to the following algorithm: 

1. Determine an upper bound on the free distance. 

2. In the trellis of the first constituent encoder search all information sequences such 
that each sequence generates one completed detour with output weight smaller than 
or equal to the bound, and length not bigger than the blocklength N. 

3. Determine all different combinations of sequences found in step 2. A combination 
may contain a certain sequence more than once. The sum of the lengths of the 
sequences that a combination contains, is smaller than or equal to N. The sum of 
the output weights that the sequences of a combination cause, is smaller than or 
equal to the bound on the free distance. 

4. For each combination found in step 3, assemble all possible information blocks of 
length N that contain the sequences of the combination by embedding these se- 
quences in zeros. 

5. Interleave all information blocks from step 4, encode the resulting sequences with the 
second constituent encoder, and record the additional output weight in the second 
encoder. If the total output weight is smaller than the bound, the bound takes on 
this new value and the process continues. 

6. The final value of the bound is the free distance of the Turbo Code. 
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The complexity of the algorithm strongly depends on the upper bound for the free distance. 
The number of sequences, found in step 2, increases exponentially with this bound and 
so does the n um ber of information blocks to consider in step 4. The number of possible 
information blocks further depends on the block size N and the puncturing period, but 
only in a polynomial fashion. 

Note that when the output of a constituent encoder is punctured, a detour can cause 
different output weights depending on its starting position in the trellis. The starting 
position of a detour in the trellis determines how its output sequence is punctured. A 
sequence may therefore appear multiple times in S with a different output weight. Thus 
when information blocks axe assembled also the puncturing has to be taken into account 
to determine the output weight this block produces in the first encoder. 

2.2.2 Implementation. 

The algorithm is implemented in two ANSI-C programs: 

The program ‘pathfind’ calculates all the sequences causing completed detours in the trellis 
of the first constituent encoder, with output Hamming weight smaller than or equal to 
the upper bound on the free distance. Also the puncturing is taken into account. The 
program requires the following input: 

• The generators of the first constituent encoder. 

• The puncturing pattern. 

• The upper bound on the free distance. 

• An upper bound on the input weight, which proved useful in the case of a rectangular 
interleaver, see Section 2.4. 

• Name of the output file. 

The following output is produced: 

• Status files. 

• An output file containing a list of the found sequences, including: 

— Input weight. 

— Output weight. 

— Beginning of the sequence relative to the puncturing period. 

— The location of the ones in the sequence. 

The output of ‘pathfind’ is then used to calculate the free distance in the program ‘freedist’. 
‘freedist’ assembles all possible information blocks and encodes them in the second encoder 
after interleaving. In order to function efficiently, it starts encoding the interleaved block 
at a depth in the trellis corresponding to the location of the first one. Up to that point, 
no weight has been accumulated. The encoding lasts until the sum of the weight caused in 
the first encoder and the accumulated weight in the second encoder exceeds the bound, or 
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until the end of the trellis is reached. In the second case a codeword with weight smaller 
than the bound has been found. 

‘freedist’ requires the following input: 

• The file produced by ‘pathfind’ containing the sequences for the first encoder. 

• The upper bound on the free distance. 

• The generators for the second encoder. 

• The puncturing pattern. 

• The interleaver. 

• The name of the output file. 

The output of ‘freedist’ is a file containing the free distance and the information blocks 
that lead to free distance codewords. 

As mentioned in Section 2.2.1, the complexity of the algorithm primarily depends on the 
upper bound for the free distance. This is reflected in the calculation time for both pro- 
grams and the memory requirement to store the sequences found by ‘pathfind’. Therefore, 
if the expected free distance becomes too large, the algorithm becomes unpractical. 

2.3 The free distance for a pseudorandom interleaver. 

2.3.1 Problem description. 

In this section, the free distance algorithm is applied on a Turbo Code with a 64K pseudo- 
random interleaver. The component encoders and the puncturing pattern are chosen as 
in [1]. The first encoder is terminated using the last four bits of the information block. 

2.3.2 Application of the algorithm. 

The free distance of the unpunctured constituent encoder is six and caused by the infor- 
mation sequence 100001. When the output is punctured, the same sequence generates 
the free distance of the punctured first encoder: four. Therefore, the free distance of the 
Turbo Code has to be at least four. This weight is also caused in the first encoder by 
the sequences 10000000001 and 1100011 when these sequences start at odd locations in 
the trellis. This is illustrated in Table 2.1: The output weight that these input sequences 
produce depends on the constituent encoder and the parity of the position where they 
begin in the trellis. Note that the free distance of this Turbo Code could be as small as 
four: the output weight of the input sequence 1100011 can be completely punctured away 
in the second encoder. 

For these reasons the expected value for the free distance was first set to four and then 
increased to five and six. 
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Table 2.1: This table shows the systematic and parity output sequences of a (37,21) RSC 
encoder for three input sequences. For a Turbo Code with puncturing pattern as defined 
in [l], the effective output sequence depends on the constituent encoder and the begin 
position (even or odd) of the input sequence in the information block: the erased bits are 
punctured. 
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2.3.3 Results. 

The free distance was determined to be six, and there are three codewords of this weight. 
Their information blocks are depicted in Figure 2.2. They all consist of an information 
sequence of weight two interleaved to a similar sequence. 

In Figure 2.3, simulation results for this Turbo Code are shown, together with the free 
distance asymptote. 

This free distance asymptote is the free distance term of the union bound for block 
codes and a binary input AWGN channel: 


n< £ £a (t M).£.i.erfc 


d=d 


free 


XV— 1 



The symbols in this formula stand for: 


( 2 . 1 ) 


Pb The bit error rate. 


d The codeword weight. 
w The information weight. 

a(w, d ) The number of codewords with information weight w and codeword weight d. 
N The blocksize or, equivalently, the interleaver size. 

R The rate of the code. 


Eb/N 0 The signal to noise ratio per information bit. 
For the free distance asymptote: 


3- 


__L_ . I . erfc 
65536 2 



Note that, see Figure 2.3, the free distance asymptote is responsible for the error floor 
of this Turbo Code. The reason for the flatness of the error floor is that it is caused by 
a small free distance. For higher signal to noise ratios, however, the ‘floor’ will disappear 
because the bit error rate decreases exponentially as a function of the signal to noise ratio. 
It is important for the performance of this Turbo Code that the number of the free distance 
codewords be small. This leads to a small effective multiplicity: 

xv=l 

of the free distance term of the union bound. The reason why the free distance asymptote 
determines the performance at low signal to noise ratios will de discussed in Chapter 3. 
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Figure 2.2: The creation of the three free distance codewords. The information blocks 
entering the constituent encoders are depicted. The arrows between the mformation blocks 
symbolize the interleaving. The numbers sire the positions in the information block. The 
codeword H am ming weight that these information sequences produce can be checked with 
Table 2.1: it equals six for each codeword. 
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Figure 2.3: The performance of a Turbo Code with 64K pseudorandom interleaver: sim- 
ulation results and the free distance asymptote. 


2.4 The free distance for a rectangular interleaver. 

2.4.1 Problem description. 

In this section, the free distance algorithm is applied to a Turbo Code with a 120 x 120 
rectangular interleaver. The information block is hence 14400 bits long. The information 
bits are written linewise in the interleaver and read columnwise. The component encoders 
and the puncturing pattern are chosen as in [1 J. The first encoder is terminated using the 
last four bits of the information block. 

2.4.2 Application of the algorithm. 

Because of the regular structure of this interleaver, it is possible to find an analytical bound 
on the free distance. The information pattern in Figure 2.4 generates a codeword of weight 
twelve. This can be verified using Table 2.1. The two horizontal 100001 sequences each 
cause outputs of weight four in the first encoder, while the two vertical 100001 sequences 
each cause outputs of weight two in the second encoder. 

An upper bound of twelve on the free distance leads to 43819 sequences in the trellis of 
the first encoder that cause completed detours. However, the number of these detours to 
consider for the algorithm can still be decreased, based on the following facts. 

• Of all the 43819 sequences for the first encoder, the shortest has a length of 5, the 
longest a length of 53. 

• For the second encoder the longest possible detour with weight smaller than or equal 
twelve, has a length equal to 117. Note that for the second encoder both completed 
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Figure 2.4: This input pattern causes a codeword of weight twelve for this Turbo Code 
with rectangular interleaver. 


and incompleted detours must be considered. 

• The interleaver structure and size. 

• From the trellis of the encoders, it can be seen that every information sequence 
causing a completed detour, starts and ends with a one. 

In Figure 2.4, the ones of the two vertical sequences for the second encoder overlap with the 
ones of the two horizontal sequences going into the first encoder. In this way, a codeword 
is formed with weight twelve. 

Now, because the length of the longest sequence to consider is 117 and this is smaller than 
the interleaver width of 120, it is impossible for the ones of a single sequence for the first 
encoder to overlap with the ones of a single sequence for the second encoder. The ones of 
any sequence for the first encoder axe at least separated by 5 positions. After interleaving, 
they are separated by at least 5 x 120 positions, and therefore no sequence for the second 
encoder of length 117 or less can have overlapping ones with the sequence in the first 
encoder. As a matter of fact, one can prove that one needs at least two sequences for 
both encoders to form a pattern such that the ones overlap. Therefore, only information 
blocks that contain two or more sequences for the first encoder can cause free distance 
codewords. 

Because the bound equals twelve and the free distance is four for the punctured first 
encoder, only completed detours with weight up to eight have to be considered for the 
free distance algorithm. Considering a detour of weight nine already leads to codewords of 
weight thirteen, since there axe a minimum of two detours, and the second detour causes 
at least weight four. 

This reasoning reduces the number of sequences to 274. 

2.4.3 Results. 

The free distance of this code was found to be twelve. It is caused by the patterns shown 
in Figure 2.5. 
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Figure 2.6: The performance of a Turbo Code with 120 x 120 rectangular interleaver, 
simulation results and the free distance asymptote. 


It is remarkable that the number of free distance codewords is very high. It is equal 
to the amount of times that these patterns fit into the interleaver. The first pattern, for 
example, fits (y/N - 5) * (>/N - 5) times, causing an equal amount of codewords of weight 
twelve. Due to the puncturing of 10000000001, which needs to be odd to cause weight two 
in the second encoder, the second pattern fits {y/N — 5) * ( Y ^~— ) times. After similar 
considerations for the other patterns, the free distance asymptote is given by. 


{y/N- 5) 2 + 2 


VN- 10 

' O ' 


(y/N-S) + { 


y/N- 10 


4 1 ^ 
— * r * erfc 

N 2 



For N = 14400 this becomes: 


28900 • 


14400 2 


erfc 



This is the free distance asymptote drawn in Figure 2.6. 

The reason why the free distance asymptote was first derived for N in general is because 
it can now be shown that it remains the free distance asymptote as the interleaver size 
increases. This follows from the fact that the bound on the free distance does not change 
for a larger interleaver: the pattern in Figure 2.4 still gives weight twelve. This means that 
the sequences to consider for the algorithm remain the same. The size of the 120 x 120 
interleaver is so large that every pattern formed in a larger interleaver can also be assembled 
in this interleaver. Therefore no codeword with weight smaller than twelve can be formed 
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Figure 2.7: At every point of the trellis of a convolutional code, free distance paths start. 


in the larger interleaver as it can not be done in the 120 x 120 interleaver. As the size 
of the interleaver increases, the free distance asymptote rises and converges toward the 
following curve: 


4.5 • erfc 



Comparing to the previous example of a Turbo Code with a 64K pseudorandom interleaver, 
the performance for this Turbo Code is worse at moderate signal to noise ratios. At first 
this might appear strange because the Turbo Code with rectangular interleaver has a free 
distance that is twice that of the Turbo Code with pseudorandom interleaver. However, 
the number of the free distance codewords is 28900 with the rectangular interleaver versus 
only 3 for the pseudorandom case. It is also interesting that the error floor is not as flat as 
that of the pseudorandom interleaver, as shown in Figure 2.6. The slope of the error floor 
is determined only by the free distance, not by the number of free distance codewords. 
The higher the free distance, the steeper the slope. 


2.5 Turbo Code versus convolutional code. 

In the previous example of a Turbo Code with a rectangular interleaver, the number of 
free distance codewords increases proportionally to the interleaver size N. This results in 
a bad performance. Something similar happens when a convolutional code is terminated 
and treated like a block code. At every point in the trellis, free distance paths start. 
This is illustrated in Figure 2.7. Thus, the number of free distance codewords increases 
proportionally to the block size. 

The Turbo Code with the 64K pseudorandom interleaver, however, has only three code- 
words at the free distance. This effect can not be reached with a convolutional code. The 
only way to improve the asymptotic performance of a convolutional code is by increasing 
the free distance itself, or decreasing the number of free distance paths starting at a given 
point in the trellis. This leads to using complexer convolutional codes with more memory. 
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2.6 The parity-check matrix of a Turbo Code. 

2.6.1 Intentions. 

In Section 2.2.1, a method has been developed to determine the free distance of a Turbo 
Code. The creation of codewords has been explained in terms of detours in the trellises 
of both constituent encoders. A free distance codeword is formed when an information 
sequence causing a low weight path in the first encoder is interleaved to a similar sequence 
for the second encoder. No insight has been given, however, in how to design an interleaver 
for maximizing the free distance and for obtaining a m i n imum number of free distance 
codewords. An analysis will be presented here that defines the problem mathematically, 
giving better insight into the role of the interleaver, and showing the complexity of inter- 
leaver optimization. 

The approach consists in developing the parity check matrix of a particular Turbo Code. 

2.6.2 The parity-check matrix. 

In [16], the following method is given to construct a reduced parity-check matrix (the 
names of the symbols are changed to avoid confusion): 

Theorem 1 Construction of a Reduced Parity- Check Matrix: A reduced parity-check ma- 
trix H for an (Z, N ) q-ary linear code with 1 < N < Z having an encoding matrix G 
(whose rows we denote here by ui,U 2 , - • • , v^ ) can be constructed as follows: 

1. Choose Vi as any vector in F N \ S(v i, V2> • - • j u i-i) f or i — N 4- 1, N 4- 2, . . . , Z. 

2. Form the Z x Z matrix M whose rows are vi,V 2 ,vz- 
Then compute the inverse matrix M~ l . 

3. Take H T as the last Z - N columns of the matrix M~ l . 

Here F stands for GF{q), and S(vi, v 2 , . . . , v*- 1 ) for the vector space spanned by «i, t/ 2 , . . • , *>*• 
The free distance is then equal to the smallest positive integer d such that there are d rows 
of H 7 that are linearly dependent. 

2.6.3 The parity-check matrix of a Turbo Code. 

Consider a Turbo Code with constituent encoders with generators (37,21), as defined in 
[1]. None of the encoders is terminated. Only the systematic output of the second encoder 
is punctured, making this a rate 1/3 code. At first no interleaver is used, but the block 
length equals N. 

The first step in determining the parity-check matrix is to find an encoding matrix G. 
Therefore the N linearly independent information blocks 00 ... lj ... 0 with i — 0, 1, — , N— 

1 are encoded. These sequences contain only a single one at location i. The resulting code- 
words are linearly independent and form a basis for the vector space of the codewords. 
They are used as rows for the encoding matrix. The dimension of the encoding matrix 
is therefore N x 3 N. After reorganizing the columns of the output of the first encoder, 
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such, that the systematic part comes first, followed by the parity part, the encoding matrix 
looks as follows: 

( In*N PtfxN | PtfxN ) 

Here PnxN stands for the parity output of the encoders and InxN for the systematic 
output. InxN is the identity matrix of dimension N x N, PnxN is given by: 

/ 1 1 0 0 1 0 1 0 0 1 0 ... \ 

011001 01001 
001100 1010 0 

^ 000000 . ..0000 1 ) 

Every row of -PjVxJV consists of the previous row, shifted one position to the right, and 
completed with zeros at the front. The encoders are time- invariant. 

One possible choice for the matrix M is: 

InxN PnxN 
OtfxN InxN 
OtfxN OtfxN 

OnxN is the zero matrix of dimension N x N. 

It is easy to verify that M x M = IiNxZN, or that M~ l — M. Therefore H T is given by. 




Now consider the case where an interleaver is used. Every information block 00 ... 1* ... 0 
with i — 0, 1 , . . . , N — 1 would be interleaved to 00 . . . l<j(j) ... 0 where G(i) is a bijection 
from {0,1,..., N — 1} onto itself, and symbolises the function of the interleaver. This 
results in a permutation of the rows of PnxN for the second encoder. ( G ) is then equal 

to 

PnxN Pnxn{C) 

ItfxN OtfxN 

OtfxN LvxN 

where Pnxn{G) symbolises the dependence on the interleaver. 

This analysis shows that: 

• Maximizing the free distance for this Turbo Code with a fixed interleaversize is now 
equivalent to finding a permutation G such that the smallest number of rows of 
FfT (G) that are linearly dependent is maximized. 

• This analysis shows the function of the interleaver on the lowest level: it relates the 
basis vectors of the constituent encoders. 

• For practical block lengths, the number of possible permutations, iV!, is too large to 
find the optimal interleaver. Therefore a systematic way is needed to optimize the 
interleaver for the free distance. 
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2.7 Conclusion. 

In this chapter, the free distance problem of Turbo Codes was addressed. The following 
is achieved: 

• In the introduction it has been shown that ‘the’ Turbo Code is not uniquely defined, 
both the interleaver and the termination of the constituent encoders have an influence 
on the free distance. In [1 J, Berrou et al. have introduced a class of codes, not one 
specific code. This means that task two of the project description, ‘Find the free 
distance of the TURBO code’, cannot be solved literally. 

• An algorithm for calculating the free distance of Turbb Codes was developed and 
implemented. 

• This algorit hm was then applied to a Turbo Code with a 64K pseudorandom inter- 
leaver. This led to an explanation of the error floor. The error floor is caused by 
the free distance asymptote. It looks so flat because of the low free distance of the 
code. However, due to the low multiplicity of the free distance codewords relative 
to the interleaver size, the effective multiplicity, Ylw=i a ( w -> d) ' 77 > * s vei T small so 
that the performance is good. 

• The algorithm was also applied to a Turbo Code with a 120 x 120 rectangular 
interleaver. This code has a bigger free distance, but also a very large number of 
free distance codewords. It was proven that as the interleaver increases, the free 
distance remains twelve, but the performance of the code does not improve since the 
number of free distance codewords grows proportional to the interleaver size N. For 
this Turbo code, the error floor is steeper than in the previous case due to the bigger 
free distance. 

• The free distance problem of a Turbo Code has been compared to that of a ter- 
minated convolutional code. It was shown that the effect of a low number of free 
distance codewords cannot be obtained with a terminated convolutional code. 

• In the last section a different approach was taken. Using the parity-check matrix 
for a Turbo code, the problem of optimizing an interleaver for free distance was 
formalized. This illustrated the function of the interleaver and the complexity of 
the problem. A systematic method is needed to optimize the interleaver for the free 
distance. 

The free distance problem of Turbo Codes with pseudorandom interleavers has already 
been investigated by Robertson in [3J. Robertson has found the free distance of these 
codes and concludes: 

It has been observed that the event leading to the minimum distance of 6 is 
quite rare, it typically occurs just a few times for any particular interleaver. 

Therefore, part of the work described above is a verification of Robertson’s work. However, 
the free distance algorithm, the free distance of a rectangular interleaver, and the parity- 
check matrix is not mentioned in [3]. 



Chapter 3 


The Distance Spectrum of TURBO 
Codes. 


3.1 Introduction. 

This chapter focuses on the distance spectrum of Turbo Codes. The distance spectrum is 
defined as Ylw w ' a ( w ^)i which is a function of d. See Section 2.3 for the notation. It is 
the goal of this chapter to reveal how the distance spectrum changes as the interleaver size 
increases and to show how this change influences the code’s performance. As mentioned 
in Section 1.1, two major characteristics describe the performance of Turbo Codes: As 
the interleaver size increases, the error floor is shifted down, while the point at which the 
error floor stops dominating the code’s performance is shifted to the left. Both of these 
properties can be explained by considering the distance spectrum of Turbo Codes. 

A first analysis of the spectrum is made in Section 3.2. The influence of the interleaver on 
the distance spectrum becomes clear, and the ideal interleaver is defined. 

In [7!, Benedetto et al. introduce the concept of random interleaving to calculate an 
upper bound on the bit error probability of Turbo Codes with given constituent encoders 
and block length. The principles of random interleaving are discussed in Section 3.3. In 
Section 3.4, an improvement for random interleaving is proposed and random interleaving 
is extended to punctured Turbo Codes. 

The effect of increasing the interleaver size on the distance spectrum is investigated in 
Section 3.5 using the concept of random interleaving. Some theoretical results are obtained 
for use later in the chapter. 

The results of Section 3.5 are applied in Section 3.6 to explain the behavior of the error 
floor. Expected values for the free distance and the number of free distance codewords are 
derived. The results show that there exist complex block codes with a low number of free 
distance codewords. 

Finally, in Section 3.7, theoretical and empirical evidence is presented to support the 
concept of spectral thinning for Turbo Codes. A theory is proposed that relates the 
distance spectrum to the code’s performance. The theory predicts that the performance 
of Turbo Codes approaches the Shannon limit as the interleaver size, and thus the codeword 
length, increases. 
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3.2 A first analysis: an ideal interleaver. 

3.2.1 Intention. 

It is the intention of t his section to perform a first analysis for the distance spectrum of a 
Turbo Code. For a constant block length, the influence of the interleaver on the distance 
spectrum will be shown, and compared to the case where no interleaver is used. 

In Section 2.2.1, the creation of a typical codeword is shown: An information block causes 
detours in the first encoder which generates the output weight for the first encoder, in- 
dependent of the interleaver. The information block is then interleaved and encoded by 
the second encoder. However, the weight generated by the second encoder depends very 
strongly on the interleaver. It is intuitively clear that in order to obtain good performance, 
an information block causing a low weight output in the first encoder, should be inter- 
leaved so that it produces a large weight in the second encoder. 

In this section, a limit will be introduced on how much the interleaver can improve the 
performance of the code for a constant block length N . 


3.2.2 A necessary proof. 

Before continuing, the following result will be derived: 

Lemma 2 Assume: 

• An ordered sequence: d\ < d 2 < . . . 5* d+ Si — 5: with i = 1,2,..., M . 

• Let G be a bijection of {1, 2, . . . , M} onto itself. 


• Let D{(G) — di + d(j(i) with i = 1, 2, . . . , M . 

• Define Da v %(G) = which is the average value of Di(G). 

• Define Dy^G) — ( - -- , which is the variance of Di(G). 

Then: 

1. DAvg(G) is independent of G. 

2. The bijection G{i) = i, Vi, maximizes Dy^G). 

3. The bijection G{i) - M - i + 1 ,Vi, minimizes Dy^G). 

The first item is simply proven: 

Efa, DiiG) + <fcw) 2 Sgj * - r 


Dav S (G) = 
For the second item: 


M 


M 


M 


n ri n E“,(GAv g -A(i)) 2 ESx^a mWi 

L'VarU) - i^Var^j — ^ 


C- a 
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/ Af M \ 

= -E^^o) 

\:=l i= 1 / 

Af 

*=i 

> 0 

There is equality if and only if di — ^G(i)? ^ particular for G(i) = i,Vi. 

For the third item: 

Hvar(G)-£>Var(iW-t + l) (3.1) 

Z!i=i(-^Avg — Di (G)) 2 5Zi=i(-^ > Avg — Dj(M — i + 1)) (3 21 

M M 

= Jf ■ f E * ' (<k<0 " j ^ 3 ' 3) 

The last expression equals zero if d G ^ = djtf-t+i, or ^G(i) is a decreasing sequence in 
function of i. In particular is this valid for G(i) = M — i + 1. 

Assume a bijection G', such that dG'(t) is n °t a decreasing sequence in function of i. Then 
it can be shown, that by sorting this sequence pairwise, Dy 3tr {G r ) — Dyar{M — i + 1) 
can be reduced, thereby altering G # , until at last a decreasing order is reached, and 
Dvax{G') — Dy ax (M — i + 1) =0. 

This is now proven: 

Assume a bijection G r , such that d G >^ is not a decreasing sequence in function of i. Then 
there exist j and k with j > k such that d G <(j} > d G t(ky When j > k then dj > dk and 
dM-j+i < d M -k+ 1 - 

If the values of G'(j) and G\k ) are exchanged, the following two terms in the sum of 
Equation 3.1: 

' (d G >{j) — dAf-j+i) + dk • ( d G '(k ) ~ dM-k+i) 

become: 

dj • (d G '( k ) ~ dM-j+i) + d k ’ (d G r(j) - d M - k + 1 ) 

Therefore the sum in Equation 3.1 decreases by an amount equal to the difference of the 
last two expressions. This difference equals: 

{dj - d k ) • {d G >(j) - d G '( k )) > 0 

Thus, sorting the sequence d^qi) pairwise to a decreasing sequence, and thereby altering 
G ; , can only decrease the sum in Equation 3.1 (or keep it unchanged). This fact, and the 
fact that a sorted decreasing sequence makes this sum zero, means that 

£>Var(<?) - £>Var(M - t + 1) > 0 


This proves the third item. 
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3.2.3 The ideal interleaver. 

Now a Turbo Code is considered consisting of two identical non-punctured encoders (not 
even the systematic part of the second encoder is punctured). None of the encoders are 
terminated. The encoders have thus an identical distance spectrum. It is the role of the 
interleaver to assign to every codeword of the first encoder, a codeword of the second 
encoder with the same information weight. 

Lemma 2 can now be used to make some statements about the distance spectrum of this 
Turbo Code. For a certain information weight w, the ordered sequence di(w) < d 2 (w) < 
... < di{w) < ... < d M{ W) (w) with i = 1,2, . . . , M{w) represents the distance spectrum 
of each of the encoders. The bijection G(w ) realizes the function of the interleaver.. The 
resulting spectrum of the Turbo Code is given by the sequence Di{G(w)) (for every infor- 
mation weight separately!). 

Applying Lemma 2 in this manner leads to the following conclusions about the distance 
spectrum of these Turbo Codes: 

1. The average value of the weights of the codewords is not a function of the interleaver. 

2. Using no interleaver maximizes the variance of the weights of the codewords. 

3. Using an ideal interleaver that interleaves the ordered sequence d\(w) < d 2 {w) < 
... < di(w) < . . . < djvf(u;)(u;) with G(i) = M(w) — i 4-1 minimizes the variance of 
the weights of the codewords. This interleaver relates low weight codewords for the 
first encoder to high weight codewords for the second encoder and vice-versa. 

4. The use of any other interleaver will result in a variance of the codeword weight 
between these two extremes. 

Therefore the use of an interleaver pushes the spectrum together, towards the average 
codeword weight, thereby shifting low weight codewords to higher weight. 

Notice that the ideal interleaver, as described above, cannot be realized in general. It 
is impossible to independently relate the codewords of the first constituent encoder to 
the codewords of the second constituent encoder, as is assumed here. As described in 
Section 2.6.3, only the basis vectors can be related independently! Therefore, these Turbo 
Codes with the ideal interleaver will lead to a very weak lower bound on the performance 
of the code for a certain block length and component encoders. 

In Figure 3.1, three spectra are drawn for a Turbo Code with (37,21) encoders and block 
length N - 20: the spectrum without interleaver, the spectrum for a pseudo-random 
interleaver, and the spectrum with ideal interleaver. The spectrum for the ideal interleaver 
has a very unrealistic shape: this is expected because the interleaver cannot be realized. 
Compared to using no interleaver, the pseudorandom interleaver compresses the spectrum, 
as expected. 

3.3 The principles of random interleaving. 

Random interleaving for Turbo Codes has been introduced by Benedetto et al. in [7|. In 
[9) the method has been used to evaluate an upper bound on the bit error probability 
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Figure 3.1: The distance spectrum of a Turbo Code: without interleaver, a pseudorandom 
interleaver, and the ideal interleaver, ’m’ stands for * a(w,d), see Equation 2.1. d 

stands for the Hamming weight of the codewords. 


of Turbo Codes with given constituent encoders and block length. The method will be 
shortly explained here. 

Assume two information blocks, with weight w , causing detours in both constituent en- 
coders, as in Figure 3.2. The probability that the information block entering the first 
encoder is interleaved to the information block for the second encoder, is equal to the 
number of interleavers that interleave the first block to the second block, divided by the 
total number of interleavers. This, off course, assumes a uniform distribution over the in- 
terleavers. The total number of interleavers of length N that interleave correctly is given 
by w\ • (N — w)l: There are tu! ways of interleaving the ones to the ones, and (N — w)\ 
ways of interleaving the zeros to the zeros. The total number of interleavers of length N , 



RSC 1 


RSC 2 


Figure 3.2: 
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is given by Nl. Therefore this probability equals w ' or : 

(t) 

Next, Benedetto et al. introduced the input-redundancy weight enumerating function 
of a systematic block code as: 


A c (W,Z)='£A i jW i Z' 

where A%j stands for the n um ber of codewords caused by an information block of Hamming 
weight i whose parity check bits have Hamming weight j, so that the total Hamming weight 
is i 4- j. 

To stay consistent with the notation used previously in this report, the following equivalent 
function is introduced, the input-output weight enumerating function (IOWEF): 

A C (W, D)=Yl A^ d W w D d (3.4) 

w,d 

where A ^ d stands for the number of codewords caused by an information block of Ham- 
ming weight w and that have total Hamming weight d. The two functions are equivalent 
because i + j — w 4- j ' = d. 

The Equation 3.4 can be rewritten as: 

A c (W,D) = Y. A ^ WW 

w 

with: 

Ag(D)=Y, A ^ Dd 

d 

The constituent codes of a Turbo Code can be considered as block codes since the in- 
formation blocks are finite. Therefore they have IOWEF’s A C ^(W,D) and A ( '^{W, D). 
Benedetto and al. proved that for the IOWEF of the Turbo Code, .A '^^'(W,D), averaged 
over all possible interleavers: 



The difficulty lies in calculating the terms A ^ d for each of the constituent encoders. The 
approach is to calculate how a particular completed detour, or series of completed detours , 
contributes to these terms. A ‘series of detours’ is defined as an ordered group of detours. 
This group may contain a particular detour more than once. The ‘output weight of a 
series of detours is the sum of the output Hamming weights of the detours. The ‘length 
of a series of detours is defined as the sum of the lenghts of the information sequences that 
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Figure 3.3: There is only one information block that causes this uncompleted detour in 
the trellis of a constituent encoder. 



Figure 3.4: This series of three detours contains two completed detours and one incom- 
pleted detour. 


encoder, and in this specific order, is given by: 

( N -l + (n-l)\ (3.7) 

V (n-1) / 

It equals the number of partitions of N — l into n numbers: there are N — l zeros to be 
divided over n locations before and between the sequences, but not after the last sequence. 
This series of detours contributes with this amount to 

Neglecting information blocks that do not terminate the constituent encoder as in [9] leads 
to incomplete spectra for Turbo Codes averaged over all interleavers. 

3.4.2 Extension to punctured codes. 

Up to this point, only non-punctured constituent encoders were considered for random 
interleaving. However, most Turbo coding schemes use puncturing to increase the spectral 
efficiency. 

An example shows the influence of the puncturing on random interleaving: Consider 
a constituent encoder with generator (37,21) and whose output is punctured as the first 
constituent encoder in [1]. Figure 3.5 shows two information blocks going into this encoder 
with N - 10. The information blocks cause the same completed detour in the trellis: they 
both contain the sequence 11111 that causes the completed detour. Nevertheless, they 
produce a different output Hamming weight due to the puncturing. Instead of contributing 
with jV — i + 1 = 10 — 5 + 1 = 6 to A 04 for some d, this detour contributes with 3 to 
A 5i 5 and 3 to A 5J . The best way to think of this, is that there are actually two different 
sequences causing different output weights, but that there are restrictions on the positions 
where these sequences may begin in the information block: one sequence has to start on 
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Figure 3.5: These identical sequences 11111 start at different positions in the information 
block. They produce different output weight due to the puncturing. 


an even position, the other on an odd position. 

In general, with a puncturing period of T, every completed detour for the non-punctured 
encoder leads to T completed detours for the punctured encoder with different output 
Hamming weights. Any of the T information sequences of these detours can only start 
at a limited number of positions in the information block: those positions that have the 
same value modulo T. This means that the contribution of these detours to the distance 
spectrum of the constituent encoder becomes more complicated to calculate: 

Consider a series of n detours. The information sequence causing the i-th detour can only 
start at those positions whose modulo T value equals pi, i = 1, 2, . . . , n. pi takes on a value 
in the set {0, 1, . . . , T-l}. Now an information block of length N is constructed containing 
the ordered information sequences of the detours in the series. These sequences are placed 
so that the number of positions before the first sequence, and the number of positions 
between two consecutive sequences, is minimized so that this number takes on a value in 
the set {0, 1, ... ,T — 1}. The n um ber of positions that remain after the last sequence is 
called L. The spaces before, between, and after the sequences are filled with zeros. This 
information block causes the series of detours in the trellis of the constituent encoder. 
Moreover, by taking units of T zeros from the L positions at the end of this information 
block and inserting them before and between the sequences, more information blocks are 
formed that cause this series of detours. Note that by inserting T zeros between two 
sequences, the requirement on the begin positions of the sequences is not violated. The 
number of partitions of |_^J units of T zeros into n + 1 locations before, between, and after 
the sequences, equals: 

/ LyJ + ” ^ (3. 8 ) 

This is the total number of information blocks that cause this series of detours and it 
equals the contribution of this series of detours to d . 

It is possible to derive an upper and lower bound for this contribution. L takes on the 
maximum value N~l when the numbers of positions before the first sequence, and between 
consecutive sequences, are all zero. I is the sum of the lenghts of the information sequences. 
The upper bound on Equation 3.8 is therefore: 

/ l^J+n 


(3.9) 
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L takes on the minimum value N — l — n • (T — 1) when the numbers of positions before 
the first sequence, and between consecutive sequences, are all T - 1. The lower bound on 
Equation 3.8 is therefore: 

i N—l+n t 

L T J 


( l T J ) 


(3.10) 


For N > i, n, T, the two bounds approach one another. 

3.4.3 Calculating a few terms of the average distance spectrum. 

Calculating the whole distance spectrum will only be feasible for Turbo Codes with a small 
block size N. Hereafter, a method will be explained to calculate the distance spectrum of 
a Turbo Code, averaged over all interleavers, up to a certain weight d max . Only the first 
encoder is terminated. 

For the average Turbo Code, A^ c is defined as: 


N 


N 




= w • ^5 = X! w • a ( u, ’ d ) 


(3.11) 


tu=l 


tl!=l 


where a (in, d ) = A ^ c d . According to Equation 3.5: 


TC ^r+ y — d ^w,x 
y d 


aC2 
■ n ~w,y 


C) 


( 3 . 12 ) 


The problem is now reduced to determining at most A^ d , A^ 2 d for d < d m&x and w < N. 
For the first encoder, all completed detours are determined with output Hamming weight 
df < d max and length V < N. Using these detours, all series of completed detours with 
output weight d < dmax and length l < N are assembled and their contributions to A w d 
are calculated using the techniques explained in the previous sections. 

A similar calculation is performed for the second encoder, but incompleted detours are 

also considered. 

The number of completed detours to consider increases exponentially with d ma x and so 
does the complexity of this calculation. 


3.5 The effect of increasing the interleaver size on one term 
of the distance spectrum for average Turbo Codes. 

3.5.1 A property of recursive convolutional codes. 

The constituent encoders of Turbo Codes are (2,l,m) recursive systematic convolutional 
encoders with memory size m. For a recursive convolutional code, the input in the memory 
at every step is a linear combination of the entering information bit and at least one 

memory bit. 
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Information input 



Figure 3.6: The general feedback form of a recursive convolutional encoder. 


Lemma 3 Given a (2, 1, m) recursive convolutional encoder with memory size m, every 
completed detour in the trellis of this encoder is caused by an information sequence with 
Hamming weight at least two. 

Proof: Figure 3.6 shows the general feedback form of the encoder. 

First, it will be shown that a single information bit with value 1 is needed to take the 
encoder away from the zero state. Consider an encoder in the all zero state. . When an 
information bit with value 0 enters the encoder, the input into the memory is a linear 
combination of this 0 and the zero state of the encoder: the encoder must remain in the 
zero state. However, when an information bit with value 1 enters the encoder, 100 ... 00 
becomes the new state. 

If the encoder returns to the zero state during step t, then the inputs into the memory at 
steps t, t - 1, . . . , t - m 4- 1 all have to be zero. Before step t - m + 1, the state of the 
encoder has a 1 in the first position, otherwise the zero state would have been reached 
earlier. Starting from this state, m information bits are inserted in the encoder, such that 
there are m zero inputs entering the encoder’s memory. It will now be shown that one of 
these information bits has to be a 1. After a certain number of steps, the 1 that used to 
be in the first position of the memory, occupies the last position that is fed back to the 
input. At this point an information bit with value 1 needs te be inserted so that the input 
in the memory, which is now the sum of this memory bit with value 1 and the information 
bit, is zero. 

It has been shown that a single 1 is needed to leave the zero state, and at least another 1 
to return to the zero state. 

3.5.2 Increasing the interleaver size for average Turbo Codes. 

In Section 3.4.3, a method was explained to determine a few terms of the distance spectrum 
of the average Turbo Code. The bottom line was to calculate A^ d and A^ 2 d for a limited 
number of pairs (w, d) using series of detours for each of the encoders. Then, A w (D) can 
be determined using Equation 3.12. In this manner, every series of detours for the first 
encoder is randomly interleaved to every series of detours for the second encoder that has 
the same information weight w. It is possible, however, to skip the calculation of A w d 
and A^ 2 . and calculate the direct contribution of a specific pair of series of detours, one 
series for each encoder, to A(W,D) TC . 

Consider for the ith encoder, i = 1,2, a series of n, detours. This series has a length Z«, 
an output weight d*, and an information weight w. 
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For the first encoder, which is terminated, all the detours of the series are completed. If 
the detours of the series for the second encoder are also completed, then the contribution 
of this pair of series to A ^ d l+d2 equals: 


f N - II + Til \ 

f N — 1 2 + n 2 A 

\ rii ) 

V n 2 ) 


C) 


(3.13) 


based on Equation 3.5 and Equation 3.6. 

If the series of detours for the second encoder contains an incompleted detour, then 
the contribution of this pair of series to A ^ d l+d2 equals: 


f N — 1\ + ni \ / N - h + (^2 “ I) \ 

V rti ) V (n 2 — 1) ) (3.i4) 



based on Equation 3.5, Equation 3.6, and Equation 3.7. 

First, the influence of increasing N on Equation 3.13 is investigated. Consider N » 
/i,/ 2 ,ni,n 2 , then Equation 3.13 can be approximated as: 

W * . pfni+ri2-w (3.15) 

ni! • n 2 ! 

Due to Lemma 3 and the fact that all detours are completed, the minimum information 
weight per detour is two, and therefore w > 2 • max(ni,n 2 ). Assume without loss of 
generality n\ > n 2 , then w > 2 • fii. Three cases are considered. 

• m > n 2 : The exponent of N in Equation 3.15 is always negative. The contribution 

of this pair of series to decreases as N increases. 

• m = n 2 and w > 2 • ni.: The exponent of N in Equation 3.15 is always negative. 
Same conclusion. 

• ni _ n2 ^ w - 2 . ni : The exponent of AT in Equation 3.15 is zero. The contribution 

of this pair of series to converges to a finite non-zero value as N increases. 

Note that the series contain an equal number of completed detours, and all these 
detours have information weight two: they are caused by information sequences with 
Hamming weight two. 

Next, the infl uence of increasing N on Equation 3.14 is investigated. As above, consider 
N » li,l 2 ,ni,n 2 . Equation 3.14 can be approximated as: 

. jyni+n 2 -w-l (3.16) 

nx! • (n 2 - 1)! 

For the same reason as mentioned above, w > 2 ■ n L . For the second encoder all the 
completed detours have at least information Hamming weight two, but the incompleted 
detour may have information Hamming weight one, and therefore w > 2 • (n 2 1) + 1 

2 • n 2 — 1. Three cases are considered: 
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• m > n 2 : This implies w >2 • n L . The exponent of N in Equation 3.16 is always 
negative. The contribution of this pair of series to AJ^ x+y decreases as N increases. 

• m — n 2 : Same conclusion. 

• m < n 2 : This implies w > 2 * n 2 — 1. Same conclusion. 

This analysis leads to the following lemma: 

Lemma 4 A Turbo Code defined by two recursive convolutional constituent encoders is 
given. The first constituent encoder is terminated , the second encoder not. Two series 
of detours, one series for each encoder, is given. The series have the same information 
weight. The contribution of this pair of series to the spectrum of the Turbo Code, averaged 
over all interleavers with a uniform distribution over the interleavers, converges to a finite 
non-zero value as the interleaver size increases, if and only if: 

• the two series contain only completed detours, 

• the two series contain an equal amount of completed detours, 

• and these detours have information weight two. 

In any other case, the contribution of this pair of series converges to zero. 

Note that a p un cturing pattern is not mentioned in this lemma: Equation 3.13 and Equa- 
tion 3.14 are only valid for non-punctured encoders. However, the proof can be repeated 
with the upper bound, Equation 3.9, and the lower bound, Equation 3.10, that can be ex- 
tended to incompleted detours. Therefore, the lemma is also valid for punctured encoders. 
The proof can also be repeated for the case where none of the encoders is terminated. The 
result is identical. 

3.5.3 One term of the distance spectrum. 

In Section 3.4.3, a method was discussed to calculate some terms of the distance spectrum 
of an average Turbo Code. First, using series of detours for each of the constituent 
encoders, A^ l d and A^ 2 d are calculated for some {w,d). Then using Equation 3.12, A w ^ d 
can be determined. 

In the previous section, it has been shown that the influence on the Turbo Code’s spectrum 
of a pair of series of detours, one series for each encoder, can be calculated directly without 
first determining A^ d and A^ 2 d . Therefore, an equivalent method for calculating is 
the following: Determine all pairs of series of detours, one series for each encoder, such 
that all pairs have information Hamming weight w and output Hamming weight d. The 
output Hamming weight of a pair of series is the sum of the output weight of the series. 
The number of these pairs is finite. Omit those pairs that contain series whose length is 
greater than the information block size N. Then, calculate the influence on A^ d for every 
pair separately. 

The results from the previous section are used to predict how A^ d evolves for fixed ( w , d) 
as the information block size, equivalent to the interleaver size N, increases. Consider 
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the information block size larger than the length of every series. Then the number of 
pairs of series that contribute to A does not increase as N increases. According to 
Lemma 4, only those contributions of pairs of series that consist of completed detours 
with information weight two, converge to a value different from zero. Thus, it can be 
concluded that for large enough N , the contributions of detours with information weight 
two dominate A^ c d . 

Lemma 5 A Turbo Code defined by two recursive convolutional constituent encoders and 
a puncturing pattern is given. The information block size is N. For a certain (w,d), the 
contribution to A^ c d of pairs of series, one series for each constituent encoder, that contain 
only completed detours with information weight two, is A! . 

For every e, there exists N r such that | A^ d — A!\ < e for every N > N'. 

For the distance spectrum of the average Turbo Code, the total multiplicity of the 
codewords at distance d is A ^ ' A w°d * 

A non- catastrophic Turbo Code is defined as a Turbo Code for which Vd, 3.w , inde- 
pendent of the interleaver size, such that A^ c d — 0, Vw > w' . For a given d, the number of 
terms in the sum of Equation 3.11 is then the same for interleaver sizes N > w' . Therefore, 
see Equation 3.12, it is sufficient that one of the constituent encoders is non-catastrophic. 
This is always the case if one of the constituent encoders is systematic and its systematic 
output is not punctured. For the remaining part of this chapter only non-catastrophic 
Turbo Codes are considered. 

With the definition of a non-catastrophic Turbo Code, the previous lemma leads to: 

Lemma 6 A Turbo Code defined by two recursive convolutional constituent encoders and 
a puncturing pattern is given. The Turbo Code is non-catastrophic. The information block 
size is N. For a certain d, the contribution to A of all pairs of series, one series for 
each encoder, that contain only completed detours with information weight two, is A . 

For every e, there exists N f such that \A — A ! | < e for every N > N'. 

Every term of the distance spectrum of an average Turbo Code converges to a finite 
value as the interleaver size increases. 

3.6 The free distance of an average Turbo Code. 

3.6.1 Theory. 

In this section, it will be shown that free distance codewords of Turbo Codes most likely 
originate from a single completed detour with information Hamming weight two in the 
trellis of each constituent encoder. 

Consider the spectrum of the average Turbo Code up to Hamming weight d max . d ma x 
is the smallest codeword weight for which a pair of series of detours contributes to A d , 
where both series only consist of completed detours with information weight two. This 
weight d max can easily be found, as well as all such pairs of series: 

For the i-th encoder, i — 1,2, consider the set Si of completed detours with information 
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weight two that cause the minimum output weight dj compared to other completed detours 
with information weight two for this encoder. Every element of Si corresponds to a series 
of detours containing only this detour. Taking one of these series for each encoder, a 
pair of series is obtained that contributes to and for which the series consist of 

completed detours with information weight two. By construction, no other pair of series 
for which the series consist of completed detours with information weight two contributes 
to Aj c for d < di + d 2 - 

According to Lemma 6, as the interleaver size N increases, the multiplicities A% c for 
d < d\ 4- d 2 converge to zero. However, the multiplicity Ajjf^ converges to a finite non- 
zero value that only depends on the contribution of the pairs constructed above. A^ +d2 
converges to: 


iim A 

jV— * oo 


TC 

di+fc 


— 2- A 


TC 

2,d\+d2 



(3.17) 


The following results: Free distance codewords of Turbo Codes are likely to originate from 
a single completed detour with information Hamming weight two in the trellis of each 
constituent encoder. The construction described above makes it possible to calculate the 
expected value of the free distance and the expected number of free distance codewords. 
The interleaver size has to be ‘large enough’. 

Note that this result can be used to obtain an upper bound on the error floor that can be 
reached with a particular Turbo Code and interleaver size. There have to be Turbo Codes 
with a lower error floor than the average Turbo Code. The error floor of the average Turbo 
Code is given by: 

f^-erfc (j R . [dl + d ^ 


for N sufficiently large. 

Note that as N increases, A^_ d2 converges to a finite non-zero value. Therefore: As the 
interleaver size of an average Turbo Code increases, its error floor shifts down proportion- 
ally to jf. As the interleaver size increases, it is possible to find interleavers that lower 
the error floor. This theoretical result confirms the simulation results in [3]. An example 
in the following section demonstrates the theory. 


3.6.2 An example. 

Consider a Turbo Code with constituent encoders and puncturing pattern as in [1]. The 
information block size, or interleaver size, is N. The construction of Section 3.6.1 is 
applied to calculate the expected value of the free distance and the expected number of 
free distance codewords. The following results are obtained: 

• The set Si contains three detours that cause d\ — 4. 

• The set S 2 contains three detours that cause ^2 — 2. 

• 43 = A % = l^j + L^J + l^J « I - JV 
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InterteavBfsLza N 


Figure 3.7: This plot shows the convergence of m = A to 4.5. 


* A^ +d , = Ell ® • ^5.4* = * 2 • 

The third result is calculated using the elements of S\, S 2 and Equation 3.8. Now, using 
Equation 3.17: 



The expected value of the free distance is six, and the expected number of the free dis- 
tance codewords, Agg, is approximately 4.5 for ‘large’ interleavers. Figure 3.7 shows the 
convergence of to 4.5. For N — 1000, A^q equals 4.42. For a practical interleaver 
size of 64K, 4.5 will be a very good approximation. For obtaining Aj c , needs to be 
multiplied by the information Hamming weight w — 2. 

These conclusions are confirmed by the results of Section 2.3. There the free distance code- 
words are determined for a particular Turbo Code with the same constituent encoders, 
punct ur ing pattern, and a 64K pseudorandom interleaver. This code has a free distance 
equal to six and three free distance codewords. Moreover, the form of the free distance 
codewords is exactly as described above: The free distance codewords cause a single com- 
pleted detour of inf ormation weight two in the trellis of each constituent encoder. 

Using the union bound, Equation 2.1, the free distance asymptote of the average Turbo 
Code is given by: 
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Figure 3.8: This plot shows the convergence of m = A^ c for d < 6. As the interleaver 
size N increases, A converges to zero for d < 6. ‘d’ stands for the codeword Hamming 
weight. 


This is an upper bound on the obtainable error floor for a particular Turbo Code. As the 
interleaver size increases, the bound is shifted down proportionally to jf. 

Figure 3.8 shows how A^ c converges to zero for d < d\ + di — 6. 


3.7 Spectral thinning for increasing interleavers. 

3.T.1 Intentions. 

In the previous section, the behavior of the error floor in function of the interleaver size 
was explained. As the interleaver size increases, the error floor is shifted down. There is, 
however, a second characteristic of the performance curve that vanes with the interleaver 
size: As the interleaver size increases, the error floor dominates the performance down to 
smaller signal to noise ratios, see Figure 1.2 in Section 1.1. In this section an explanation 
for this behavior is proposed. First, some theoretical evidence is developed to support the 
idea of spectral thinning. Then, empirical evidence is presented that shows the spectral 
thinning for a particular Turbo Code. Finally, a theory is proposed that shows how spectral 
thinning influences the code’s performance. 

3.7.2 Theoretical evidence for spectral thinning. 

Section 3.5.3 treats the behavior of one term in the distance spectrum of an average Turbo 
Code. It was shown that as the interleaver size increases, A% c converges to a finite value 
that depends only on detours with information weight two. 
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In Section 3.6 this result was used to show how, for large enough interleavers, detours with 
information weight two determine the free distance of a Turbo Code. 

It is now possible to predict how the spectrum changes as the interleaver size increases. 
The evolution of a certain term AJ^p in function of the interleaver size is discussed. 

In Section(1.5.3) it was shown that Aj c can be calculated as follows: 

1. For each encoder determine all detours with output Hamming weight smaller than 
or equal to d', and length smaller than or equal to the interleaver size N . For the 
first encoder only completed detours are considered, for the second encoder also 
incompleted detours. 

2. For each encoder determine all series of detours such that each series has an output 
weight equal to d 1 and a length smaller than or equal to N . 

3. Determine all pairs of series of detours, one series for each encoder, such that the 
output weight of each pair equals d!. 

4. Calculate the contribution of each pair to A^F using Equation 3.13, Equation 3.14, 
or the equivalent of these equations for punctured Turbo Codes. ■ 

This algorithm can now be used to predict how Aj c varies when the interleaver size 
increases: 

• Assume d* > where R stands for the rate of the Turbo Code, then A d> 0 
because the codeword weight cannot be larger than the codeword length. 

• As the interleaver increases, the number of pairs of series that contribute to A d> 
increases because for an increasing number of series with output weight smaller than 
or equal to d', the length will be smaller than or equal to N . Therefore AF d F increases. 

• Above a certain size, increasing the interleaver no longer leads to more pairs that 
contribute to AF d F . This effect was explained in Section 3.5.3. When the interleaver 
size keeps increasing, the contributions to A^F of those pairs of series, where the 
series do not only consist of completed detours with information weight two, converge 
to zero. Therefore, AF d F decreases as it converges towards its final value, determined 
only by completed detours with information weight two. This effect is referred to as 
‘spectral thinning’. 

Therefore, the spectrum evolves as shown in Figure 3.9. In the next section, an example 
illustrates this behavior. 

3.7.3 Empirical evidence for spectral thinning. 

Consider a Turbo Code with constituent encoders and interleaving pattern defined as in 
[1]. For this code, random interleaving was performed using detours with information 
weight up to 4 and for codeword weights up to 20. Figure 3.10 shows the results for 
N = 50, 100, oo. As the interleaver size increases, the spectrum converges to the spectrum 
of the average Turbo Code with N = oo. This means that the terms converge to zero, or a 
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Figure 3.9: This picture shows in principle how the spectrum evolves as the interleaver 
size increases. 



Figure 3.10: This plot shows the effect of spectral thinning for a particular Turbo Code. 
The spectra were obtained using the algorithm of Section 3.4.3. m = A ^ c . 
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non-zero value. Notice that is smaller for the N — 50 code than for the N — 100 code. 
This is not in contradiction with the results of Section 3.7.2! For this weight, the length of 
the interleaver limits the number of pairs that contribute to A^q' : for many series, the sum 
of the lengths of the detours exceeds the interleaver size. As the interleaver size increases 
from 50 to 100, more pairs of series contribute to • 

Only detours with information weight up to 4 were used to keep the calculation time 
feasible. It can be expected that using all relevant detours would make the effect of 
spectral thinning stronger. 


3.7.4 A theory of spectral thinning. 

In the two previous sections, theoretical and empirical evidence is presented that supports 
the effect of spectral thinning for Turbo Codes as the interleaver size increases. In this 
section, a theory is proposed that shows the effect of spectral thinning on the performance 
of Turbo Codes. The theory explains why, for increasing interleaver size, the error floor 
dominates the performance down to smaller signal to noise ratios, and therefore, the bit 
error rate curve becomes steeper. 

Figure 3.11 shows a part of the spectrum for a Turbo Code with infinite interleaver. The 
component encoders and the puncturing pattern have been chosen as in [Ij. In Figure 3.12, 
a few terms of the uni on bound are plotted that correspond to the spectrum of this Turbo 
Code with infi nite interleaver. The free distance term exceeds all other terms: it domi- 
nates the performance of this code. 

Figure 3.11 also shows parts of the spectra for two other Turbo Codes with finite inter- 
leaver. These two spectra are fictional: they have not been calculated as is the case for 
the Turbo Code with infinite interleaver. However, they illustrate the effect of the spec- 
tral thinning described in the previous sections: as the interleaver increases, the spectrum 
converges to the spectrum of the code with infinite interleaver. Figure 3.12 also shows the 
terms of the union bound corresponding to these spectra. Note that the free distance term 
does not dominate the performance at low signal to noise ratios: there is a higher distance 
term that cuts the free distance term. Thus, the free distance term only dominates the 
performance for signal to noise ratios above the point where it is cut by a higher order 
term. This point moves to lower signal to noise ratios for increasing interleavers. 

This theory explains the shape of the performance curves of Turbo Codes. As the inter- 
leaver size increases, the spectrum thins and the error floor dominates the performance 
down to smaller signal to noise ratios. 

In reality the effects of the error floor shifting down, and the error floor dominating down 
to smaller SNR, are not independent. Both effects are a consequence of the fact that each 
term in the spectrum converges to a finite value. As the interleaver size increases, these 
effects happen simultaneously. There is no reason why this mechanism should stop when 
a certain interleaver size is reached. Therefore this theory predicts that as the interleaver 
size increases, the performance of Turbo Codes approaches the Shannon limit. 
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Figure 3.11: This plot shows three spectra of Turbo Codes. The spectrum for the infinite 
interleaver is exact, the two other spectra are fictional. The spectra show the behavior of 
spectral thinning, m — A 

3.8 Conclusion. 

This chapter has focused on the distance spectrum of Turbo Codes to explain their per- 
formance. The following conclusions can be drawn: 

• Compared to the event that no interleaver is used, the interleaver causes the variance 
of the spectrum to decrease. It does this by combining low weight codewords for 
the first constituent encoder with high weight codewords for the second constituent 
encoder, and vice-versa. 

• Every term of the distance spectrum of Turbo Codes with fixed constituent encoders 
and puncturing pattern, averaged over all interleavers, converges to a finite value as 
the interleaver size increases. 

• For a sufficiently large interleaver size, the free distance codewords most likely cause a 
single completed detour with information weight two in the trellis of each constituent 
encoder. It is relatively simple to calculate the expected values for the free distance 
and the number of free distance codewords. 

• The error floor for a Turbo Code, averaged over all interleavers, is an upper bound 
for the error floor of this Turbo Code with a particular interleaver. This bound shifts 
down as the interleaver size increases. 

• Spectral thinnig occurs as the interleaver size increases. This effect supports a theory 
that explains the shape of the Turbo Code’s performance curves. 
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Figure 3.12: This plot shows the influence of spectral thinning on the performance of Turbo 
Codes. Each of the graphs above depicts a few terms of the union bound corresponding 
to the spectra in Figure 3.11. Since only the relative position of the terms is important, 
Equation 2.1 was used with N — 1. Note that the free distance term dominates the 
performance down to smaller signal to noise ratios as the interleaver size increases. 
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• The combined effects of the error floor and spectral thinning imply that the per- 
formance of a Turbo Code approaches the S hann on limit as the interleaver size 

increases. 

The method of random interleaving was improved and extended to punctured codes in 
order to obtain these results. 



Chapter 4 


The Decoding Complexity of 
TURBO Codes. 


4.1 Introduction. 

In engineering every decision is a compromise between performance and cost. A careful 
performance versus complexity comparison is necessary to choose a code for a given appli- 
cation. It would therefore be incomplete to study the performance of Turbo Codes without 
addressing the decoder’s complexity. In this chapter the decoding complexity of Turbo 
Codes is investigated and compared to the decoding complexity of a Viterbi decoder. 

4.2 The decoding complexity. 

The decoding complexity of Turbo Codes was succesfully investigated by Andersen in 
[5], Therefore the results presented in this section differ little from Andersens . Also the 
graphical presentation of the operations originates from [5}. 

The decoding complexity is measured as the number of operations per information bit that 
are required to decode. 

First the decoding complexity of the MAP algorithm is considered. For the notations used 
in this section, see Section 1.4, where the MAP algorithm is discussed. Note that all the 
operations are floating point. In practice rescaling is needed. Figure 4.1 and Figure 4.2 
illustrate the calculation of a t (m), and ,t). 

The calculation of art(m) requires two multiplications and one addition for each state. 
Ass um ing an encoder with memory size M 1 this amounts to 2 * 2^ multiplications and 
2 lVf additions. The same number of operations is needed to calculate /?t(m / ). o t (m , i) 
is calculated along with /^(m^) so two more multiplications are needed for every state. 
For calculating the log-Ukelyhood ratio A(dj), the values of c4(m ; ,i) have to be added 
for every state, and a final division is required. Table 4.1 presents the total number of 
operations per information bit for the MAP algorithm. Division is assumed to have the 
same complexity as multiplication. 

One decoding iteration requires that the MAP algorithm be applied twice. In [1], con- 
stituent encoders with memory size M — 4 are used and simulation results are published 
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figure illustrates the calculation of oi{(m). 



Additions 

Multiplications 

MAP decoder 

4 • 2 M 

6 • 2 M 4- 1 

Turbo decoder 
1 iteration 

S~2 m 

12 - 2 M + 2 

Turbo decoder M=4 
18 iterations 

2304 

3492 


Table 4.1: The decoding complexity of a Turbo Code per information bit. 


for eighteen iterations. The table also includes the decoding complexity for this Turbo 
Code. 

The channel output and the information exchange between the decoders are assumed to 
be quantized. When this is the case, / y , t (m , ,i) can be obtained by 2 • 2 M table look-up 
operations to calculate at( m). The values of are then stored to calculate Ptiyn ). 

The decoder memory requirements can be estimated as (3 • 2 M + 4) • N words. N is the 
information block size. at(m) and 'yj(m / ,i) need to be stored for the complete block, re- 
quiring 3 ■ 2 M • N words. Also the complete input block for the encoder must be stored 
and the two interleavers need 2 • N words. 

Note that the deco din g complexity per information bit only depends on the memory size 
of the constituent encoder, and not on the information block size N. Increasing the in- 
formation block size improves the performance of Turbo Codes, but does not increase the 
decoding complexity per information bit. 


4.3 Turbo decoders versus Viterbi decoders. 

For a rate ^ convolutional code with memory size M, the Viterbi decoding complexity 
per information bit is estimated by 3 • 2 ^ integer additions and 2 • 2^ table look-up op- 
erations: Two branches arrive at every state. For each branch, the metric is found in 
a look-up table. This branch-metric is then added to the total metric of the path that 
the branch belongs to. The total metrics of the two paths that arrive in the same state 
are then compared to determine the surviving path. This comparison is equivalent to a 
subtraction, or an addition. 

For the (2, 1, 14) convolutional code discussed in Section 1.1, the Viterbi decoding complex- 
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ity per information bit equals 49152 integer additions and 32768 table look-up operations. 
It is not trivial to compare the decoding complexity of the Turbo Codes with memory four 
constituent encoders, eighteen decoder iterations, and a 64K interleaver, to the Viterbi 
decoding complexity of this (2, 1, 14) convolutional code. The following considerations 
have to be made: 

• The Turbo decoder requires floating point addition and multiplication, while the 
Viterbi decoder only needs integer addition. 

• The Viterbi decoder requires more operations. 

• Does the decoder need to be optimized for chip size or decoding time? For practical 
systems, the possibility of pipelining operations or reusing calculation units needs to 
be considered. 

The comparison has to be made on the hardware level. 

Note that in order to improve the performance of a convolutional code, the memory size 
of the encoder needs to be increased. At the same time the decoding complexity per 
information bit grows exponentially. This is in contradiction with Turbo Codes where 
the performance can be improved without increasing the decoding complexity by using a 
larger information block size. It can be expected that when the performance requirements 
are high enough, the Turbo Code will outscore convolutional codes in decoding complexity. 

4.4 Conclusion. 

In this chapter, a measure for the decoder complexity of Turbo Codes was discussed. The 
decoding complexity per information bit only depends on the memory size of the con- 
stituent encoders, but not on the information block size. Therefore, the performance of 
Turbo Codes can be increased without also increasing the decoding complexity. 

The decoding complexity of Turbo Codes was compared to the decoding complexity of 
convolutional codes with Viterbi decoder. The Viterbi decoding complexity grows expo- 
nentially with the memory size of the convolutional code. It can be expected that when the 
performance requirements are sufficiently high, the decoding complexity of Turbo Codes 
is lower than that of convolutional codes with Viterbi decoding. 



Chapter 5 


Improvements for TURBO Codes. 


5.1 Introduction. 

The performance of Turbo Codes can be improved by increasing the information block 
size, or equivalently the interleaver size. For a fixed block size, however, the constituent 
encoders and the interleaver can be used to optimize the performance. Especially inter- 
esting are improvements that do not increase the decoding complexity of the code. In the 
previous chapter, it was concluded that the decoding complexity per information bit is an 
exponential function of the constituent encoder’s memory size. Therefore, the goal is to 
optimize the performance over all constituent encoders of a given memory size. 

In this chapter, constituent encoder selection and interleaver design are addressed. 


5.2 Constituent encoder selection. 

This section addresses the problem of constituent encoder selection for Turbo Codes. As 
mentioned in the introduction, it is the intention to optimize the encoder for a given mem- 
ory size. 

According to the results of Section 3.6, free distance codewords result from a single com- 
pleted detour of i nf ormation weight two in the trellis of each constituent encoder, at least 
when the interleaver size is sufficiently large. In Section 3.7, the behavior of the distance 
spectrum is discussed: as the interleaver size increases, every term of the distance spectrum 
converges to a finite value that is determined only by completed detours with information 
weight two. This convergence starts at the ‘low weight end’ of the distance spectrum and 
expands to higher weights as the interleaver size increases. It is therefore correct to state 
that the codewords of relatively low Hamming weight for a Turbo Code are caused by 
completed detours of information weight two. The word ‘relatively’ is used to stress the 
dependence of this statement on the interleaver size. 

It is therefore trivial to use a constituent encoder which maximizes the output Hamming 
weight of completed detours of information weight two. The detours that cause the ex- 
pected value of the free distance now have a larger output weight, so that this expected 
value increases. Compared to a non-optimal constituent encoder, this encoder shifts the 
codewords at the ‘low weight end’ of the spectrum to higher weights. 
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The search for a constituent encoder that maximizes the output weight for completed 
detours with information weight two, intuitively leads to maximum cycle length encoders, 
as will be illustrated in an example. The cycle length of an encoder is the period of the 
output sequence when the input sequence consists of a single one followed by zeros. 
Consider a (2, 1,4) recursive systematic convolutional code with generators (37,21). When 
the sequence 1000 ... is encoded, the output sequences are 1 00000 00000 ... for the sys- 
tematic output and 1 10010 10010 . . . for the parity bits. The cycle length 5 can be 
recognized in the output sequence. When these output sequences are serialised, the result 
is the first basis vector of the code: 

11 0100000100 0100000100 . . . 

This vector is periodic: it contains the periodic sequence 0100000100. The other basis 
vectors can be found as described in Section 2.6.3. Because the encoder is time-invariant, 
they can be obtained by shifting the sequence above. 

The output weight produced by the input sequence 100001 can now be found by adding 
the corresponding basis vectors: 

11 0100000100 0100000100 0100000100 . . . 

+ 00 0000000011 0100000100 0100000100..,, 

= 11 0100000111 0000000000 0000000000 . . . 

The output weight is six. When the sum of two basis vectors for a recursive convolutional 
encoder produces a finite output weight, the input sequence corresponding to these vectors 
causes a completed detour of information weight two in the trellis. Note that in order to 
obtain a finite output weight, the two basis vectors have to be shifted by a multiple of the 
cycle length. The basis vectors corresponding to the sequence 100001 are shifted by one 
cycle length. Each additional shift of the basis vectors over one cycle length results in an 
additional output weight of two. 

Now consider a (2,1,4) recursive systematic convolutional code with generators (23,35). 
The first basis vector is: 

11 010100000001000001010001000101 010100000001000001010001000101 . . . 

It contains the periodic part 01010000000100000101000100010. The cycle length is 15, 
which is the maximum cycle length for memory 4 encoders. The basis vectors correspond- 
ing to the input sequence 1000000000000001 are shifted by one cycle length. Therefore this 
input sequence causes a completed detour in the trellis of the encoder with information 
weight two. Again, the output weight can be calculated by adding the corresponding basis 
vectors: 

11 010100000001000001010001000101 010100000001000001010001000101 . . . 

+ 00 000000000000000000000000000011 0101000000010000010100010001 01 
= 11 010100000001000001010001000110 000000000000000000000000000000 . . . 
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Figure 5.1: The performance of two Turbo Codes with pseudorandom interleaver of size 
400 and puncturing pattern as defined in [1]. One Turbo Code has (37,21) constituent 
encoders, the other (23,35) constituent encoders. 


The output weight is ten. Each additional shift of the basis vectors over one cycle length 
results in an additional output weight of eight. 

The maximum cycle length code produces more output weight for completed detours of 
information weight two. The reason is intuitively clear: For the encoder with larger cycle 
length, the basis vectors have a larger period so that their repeated part can contain more 
ones. It is the weight of the repeated part of the basis vectors that determines the output 
weight of the sum of two basis vectors that are shifted over multiple cycle lengths. 

It would be difficult to prove that maximum cycle length encoders are always optimal: 
the interleaver size has to be large enough so that information sequences with weight two 
are likely to determine the free distance. Note that also the puncturing of the output has 
to be taken into account. Moreover, the proposed method only optimizes the low weight 
end of the spectrum. This causes the error floor to shift down, but does not change the 
performance at low signal to noise ratios. This is illustrated in Figure 5.1. 

Finally, it needs to be mentioned that Robertson proposed the use of the (23, 35) 
generators instead of the (37,21) generators in [3]. Now theoretical results have been 
found that explain why the maximum cycle length encoder is better. 


5.3 Interleaver design. 

The second way to improve a Turbo Code’s performance is by optimizing the interleaver. 
A method for interleaver optimization has been proposed by Robertson in [3]. This method 
optimizes the interleaver for the free distance. The method works as follows: 
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Figure 5.2: The performance of two Turbo Codes with constituent encoders and puncturing 
pattern as in [1]. The interleaver size is 400. One code has an optimized interleaver, the 
other code a pseudorandom interleaver. 




Chapter 6 


The Relation Between TURBO 
Codes and Product Codes. 

When Turbo Codes were first introduced, a few publications suggested that Turbo Codes 
obtain their performance in a similar way as product codes, see [10]. However, there are 
several facts that contradict this theory. 

The encoders for Turbo Codes and product codes contain the same components: two 
constituent encoders and an interleaver. The difference lies in the interconnection of these 
components. In the Turbo encoder, there is a parallel concatenation of the constituent 
encoders as opposed to a serial concatenation for the product encoder. Therefore, an 
information sequence is interleaved for the Turbo Code, and no code sequence as is the 
case for product codes: The second constituent encoder of a Turbo encoder does not 
encode the parity bits produced by the first constituent encoder. However, encoding the 
parity bits of the first constituent encoder is crucial to the performance of product codes. 
A second important difference lies in the aim of the coding schemes. The aim of building 
a product code is to obtain a code with a high free distance. This code is then used 
at moderate and high signal to noise ratios (SNR) where the free distance dominates 
the performance. For Turbo Codes which are used at low SNR, the free distance is just 
one element that determines the performance. As the SNR decreases, more terms of the 
distance spectrum influence the performance of the code. 

An interesting question remains to be solved. For a product code, increasing the interleaver 
size does not increase the free distance. Is it possible to keep increasing the free distance 
by increasing the interleaver size for a Turbo Code with given constituent encoders and 
puncturing pattern? If not, what is the maximum free distance that can be reached, and 
what effect prohibits the further increase? 

It can be concluded that Turbo Codes and product codes are two fundamentally different 
coding schemes. 
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Chapter 7 


Conclusions. 


In Chapter 2, an algorithm was developed to calculate the free distance of a Turbo Code. 
The algorithm was applied to a Turbo Code with a 64K pseudorandom interleaver and to 
a Turbo Code with a 120 x 120 rectangular interleaver. The results led to the following 
conclusion: 

The error floor is caused by the free distance asymptote of the code. In case the free 
distance is low, the free distance asymptote has a small slope at low signal to noise ratios, 
the error floor looks flat. A turbo Code can have a small free distance and still show a 
good performance when the number of free distance codewords is small relative to the 
interleaver size. 

In Chapter 3, the method of random interleaving was corrected to include incompleted 
detours, and extended to punctured Turbo Codes. Using random interleaving, it was de- 
rived that every term of the distance spectrum of the average Turbo Code (averaged over 
all interleavers of a given size) converges to a finite value as the interleaver size increases. 
This has the following consequences: 

The number of free distance codewords for the average Turbo Code converges to a finite 
value as the interleaver size increases, while the free distance stays constant. Therefore, 
the error floor is shifted down as the interleaver size increases. 

The spectr um for the average Turbo Code shows the effect of ‘spectral thinning’. ‘Spectral 
thinning’ can explain the fact that for Turbo Codes with larger interleavers, the error floor 
dominates the performance down to smaller signal to noise ratios. 

According to the theory developed in Section 3.7, the performance of Turbo Codes ap- 
proaches the Shannon limit as the interleaver size increases. 

In Chapter 4, the decoding complexity of Turbo Codes is reviewed and compared to 
the decoding complexity of convolutional codes with Viterbi decoders. The decoding com- 
plexity was defined as the number of operations required to decode one information bit. 
For the Turbo Code, the decoding complexity is only determined by the memory size 
of the constituent encoders. As the interleaver size increases and the performance im- 
proves, the decoding complexity for the Turbo Code stays constant. However, improving 
the performance of a convolutional code simultaneously increases the decoding complexity. 
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CHAPTER 7. CONCLUSIONS. 


In Chapter 5, constituent encoder selection and interleaver design are addressed. The 
proposed methods only lower the error floor, but have no influence on the performance at 
low sigal to noise ratios. The performance of Turbo Codes with ma ximum cycle length 
constituent encoders is explained. 

In Chapter 6, Turbo Codes are compared to product codes. These two coding schemes 
are fundamentally different. 
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